Level Definition DTD for PushButtonEngine

I use Oxygen for all my XML editing, and if you have a DTD or schema it offers some pretty neat autocompletion and error highlighting. With that in mind I’ve created a basic DTD for the level definition file in PBE; you can grab it from here. Note that it’s not ideal: you get spurious warnings for the fields that you specify inside your component definitions, but there’s not much that can be done about that.

Update (2009-05-24): added an XSD version of the file. This handles unparsed content gracefully, which is useful for components.
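As a rough illustration of how an XSD can tolerate free-form content, here is a minimal sketch using an `xs:any` wildcard with `processContents='skip'`. The schema below is hypothetical and is not the actual PBE schema; the `entity` and `component` element names, the attributes, and the `LaxContentDemo` class are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class LaxContentDemo {
	// Hypothetical schema: the structure of <entity> and <component> is checked,
	// but anything nested inside a <component> is skipped by the validator.
	static final String XSD =
		"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
		+ "<xs:element name='entity'><xs:complexType>"
		+ "<xs:sequence>"
		+ "<xs:element name='component' maxOccurs='unbounded'><xs:complexType>"
		+ "<xs:sequence>"
		+ "<xs:any processContents='skip' minOccurs='0' maxOccurs='unbounded'/>"
		+ "</xs:sequence>"
		+ "<xs:attribute name='type' type='xs:string' use='required'/>"
		+ "</xs:complexType></xs:element>"
		+ "</xs:sequence>"
		+ "<xs:attribute name='name' type='xs:string' use='required'/>"
		+ "</xs:complexType></xs:element>"
		+ "</xs:schema>";

	// Returns true if the document validates against the schema above.
	static boolean isValid(String xml) throws Exception {
		Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
				.newSchema(new StreamSource(new StringReader(XSD)));
		try {
			schema.newValidator().validate(new StreamSource(new StringReader(xml)));
			return true;
		} catch (SAXException e) {
			return false;
		}
	}

	public static void main(String[] args) throws Exception {
		// Arbitrary fields inside the component are accepted.
		System.out.println(isValid(
			"<entity name='player'><component type='Spatial'>"
			+ "<position><x>1</x><y>2</y></position></component></entity>"));
		// A missing required attribute on <entity> is still reported.
		System.out.println(isValid("<entity><component type='Spatial'/></entity>"));
	}
}
```

A DTD has no equivalent of a skipped wildcard, so every element that can appear has to be declared, which is where the spurious warnings come from.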

Adding Namespace Support to RvSnoop

For RvSnoop I’m currently working on adding namespaces to all of the files that are used to save the application preferences, and also to the project files. This is a Good Thing in and of itself, as it allows me to use XML Schema to validate and document the file formats. More importantly, it gives me the opportunity to refactor as I go along. The general plan is for projects in the next release to be based on directories rather than a single XML file, which will allow me to use a disk-based storage mechanism for messages so that they persist across sessions.

Playing Well With Others

At the same time, I’ve taken the opportunity to include some of the Apache Commons libraries in the build. There were (well, still are) a number of small utility classes scattered around which I’m planning on replacing with the versions from Apache.

One side effect of this is that the build is increasing in size. To help control this I’m going to remove the Berkeley DB dependency from the build. I was originally planning to use it for the on-disk message store, but I’ve changed my mind. My current thinking is to just write the messages out to files, but to keep a set of indexes (probably built using Lucene) as well for searching and sorting.

There should be another (alpha) release out later this week with the Commons and Berkeley DB changes complete, and a release next week with the new project structure.

Configuration via XML or code?

When is using XML based configuration files preferable to using source based interfaces and/or classes? I’ve been taking a look at command frameworks such as those that come with the various rich client toolkits out there, and also GUI Commands. It struck me that many things which I would allow via a concrete class or an abstract base class these frameworks try to push out into XML files.

Where is the user interface?

This is related to this quote by Michael Kay on the XML-DEV mailing list, on the subject of validation.

There is also scope for reasonableness checks to catch data input errors. But they belong as close to the user interface level as possible, not at the information management level.

Which is fine as far as it goes, but the use of the term ‘user interface’ is misleading, I think. To most people (myself included) it implies ‘end user’, but this is not always the case. If, for example, you are writing a service (web or otherwise) for external, or even only internal, use, then the ‘user interface’ is the service interface that you expose, and it’s perfectly reasonable (in fact, I’d argue that it’s pretty much essential) to validate every message that your service receives.
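To make the idea concrete, here is a minimal sketch of a service entry point that validates each incoming message against a schema before doing anything else with it. The `order` message format, the schema and the `ServiceBoundary` class are all hypothetical, invented purely for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ServiceBoundary {
	// Hypothetical message schema: an order with an item name and a positive quantity.
	static final String XSD =
		"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
		+ "<xs:element name='order'><xs:complexType><xs:sequence>"
		+ "<xs:element name='item' type='xs:string'/>"
		+ "<xs:element name='quantity' type='xs:positiveInteger'/>"
		+ "</xs:sequence></xs:complexType></xs:element>"
		+ "</xs:schema>";

	// Compiled once; a Schema is thread safe, a Validator is not.
	static final Schema SCHEMA = loadSchema();

	static Schema loadSchema() {
		try {
			return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
					.newSchema(new StreamSource(new StringReader(XSD)));
		} catch (SAXException e) {
			throw new IllegalStateException("schema does not compile", e);
		}
	}

	// The service's 'user interface': every message is checked on the way in.
	static String handle(String message) {
		try {
			SCHEMA.newValidator().validate(new StreamSource(new StringReader(message)));
		} catch (SAXException | IOException e) {
			return "rejected: " + e.getMessage();
		}
		// Only validated messages reach the information management level.
		return "accepted";
	}

	public static void main(String[] args) {
		System.out.println(handle("<order><item>widget</item><quantity>3</quantity></order>"));
		System.out.println(handle("<order><item>widget</item><quantity>-3</quantity></order>"));
	}
}
```

The point of the sketch is only where the check sits: at the service interface, before any business logic sees the message.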

JSR-173 Reference Implementation

I’ve been looking at the new streaming API for XML (JSR-173). I’ve been generally impressed, but I have found a bug in the reference implementation. Here are the details, using this test program:

import javax.xml.stream.*;

public class StaxWriterTest {
	static String nsURI = "http://ianp.org/nsURI";
	static String nsPrefix = "a";
	static int depth = 0; // Current nesting level, used to pretty print the output.
	static XMLStreamWriter w;

	public static void main(String[] args) {
		try {
			// Create the writer; standard output is an assumed destination.
			w = XMLOutputFactory.newInstance().createXMLStreamWriter(System.out);
			w.writeStartDocument();
			indent(1);
			w.writeStartElement(nsURI, "root");
			w.writeNamespace(nsPrefix, nsURI);
			indent(0);
			w.writeEmptyElement(nsURI, "levelOne");
			w.writeAttribute(nsURI, "foo", "foo");
			indent(0);
			w.writeStartElement(nsURI, "levelOne");
			w.writeEndElement();
			indent(1);
			// This start element is the one that comes out as an empty element.
			w.writeStartElement(nsURI, "levelOne");
			indent(1);
			w.writeEmptyElement(nsURI, "levelTwo");
			w.writeAttribute(nsURI, "foo", "foo");
			indent(-2);
			w.writeEndElement();
			indent(0);
			w.writeStartElement(nsURI, "levelOne");
			w.writeEndElement();
			indent(-1);
			w.writeEndElement();
			w.flush();
			w.close();
		} catch (Exception e) {
			e.printStackTrace();
			System.exit(1);
		}
	}

	// Starts a new line and indents it. A negative d shrinks the depth before
	// indenting; a positive d grows it afterwards, for the lines that follow.
	static void indent(int d) {
		try {
			if (d < 0) { depth += d; }
			w.writeCharacters("\n");
			for (int i = 0; i < depth; ++i) {
				w.writeCharacters("  ");
			}
			if (d > 0) { depth += d; }
		} catch (XMLStreamException e) {
			throw new RuntimeException(e);
		}
	}
}

This is using version 7 of the reference implementation, by the way. The program should produce this output:

<?xml version='1.0' encoding='utf-8'?>
<a:root xmlns:a="http://ianp.org/nsURI">
	<a:levelOne a:foo="foo"/>
	<a:levelOne/></a:levelOne>
	<a:levelOne>
		<a:levelTwo a:foo="foo"/>
	</a:levelOne>
	<a:levelOne/></a:levelOne>
</a:root>

But actually produces this:

<?xml version='1.0' encoding='utf-8'?>
<a:root xmlns:a="http://ianp.org/nsURI">
	<a:levelOne a:foo="foo"/>
	<a:levelOne/></a:levelOne>
	<a:levelOne/>
		<a:levelTwo a:foo="foo"/>
	</a:levelOne>
	<a:levelOne/></a:levelOne>
</a:root>

Line 5 of the output is generated as an empty element instead of a start element. I’ve pointed this out to the JCP committee; we’ll see if they respond.
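One way to check for this kind of problem programmatically, rather than by eyeballing the output, is to read the writer's output back in and look at the nesting. The following is a minimal sketch of such a round trip, not part of the bug report itself; the `WriterRoundTrip` class and its method names are made up for the example, and it uses the explicit-prefix overloads of the writer methods so that it does not depend on how a given implementation binds prefixes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

class WriterRoundTrip {
	static final String NS = "http://ianp.org/nsURI";

	// Writes root > levelOne > levelTwo, the same shape as the failing part of the test.
	static String write() throws Exception {
		StringWriter out = new StringWriter();
		XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
		w.writeStartDocument();
		w.writeStartElement("a", "root", NS);
		w.writeNamespace("a", NS);
		w.writeStartElement("a", "levelOne", NS);
		w.writeEmptyElement("a", "levelTwo", NS);
		w.writeAttribute("a", NS, "foo", "foo");
		w.writeEndElement(); // closes levelOne
		w.writeEndElement(); // closes root
		w.writeEndDocument();
		w.close();
		return out.toString();
	}

	// Parses the whole document (so malformed output throws) and returns the
	// nesting depth of the first element with the given local name, or -1.
	static int depthOf(String xml, String localName) throws Exception {
		XMLStreamReader r = XMLInputFactory.newInstance()
				.createXMLStreamReader(new StringReader(xml));
		int depth = 0;
		int found = -1;
		while (r.hasNext()) {
			int event = r.next();
			if (event == XMLStreamConstants.START_ELEMENT) {
				depth++;
				if (found < 0 && r.getLocalName().equals(localName)) {
					found = depth;
				}
			} else if (event == XMLStreamConstants.END_ELEMENT) {
				depth--;
			}
		}
		r.close();
		return found;
	}

	public static void main(String[] args) throws Exception {
		String xml = write();
		System.out.println(xml);
		System.out.println("levelTwo depth: " + depthOf(xml, "levelTwo"));
	}
}
```

With a correct writer, levelTwo sits at depth 3. If the enclosing start element were written as an empty element, the stray end tag that follows would make the output malformed and the reader would throw instead.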

Binary XML Encoding

Miguel de Icaza writes about binary encoding for XML (Omri’s page was unavailable, so I can’t comment on that).

This is already in widespread use, à la WBXML. It’s a pretty successful standard. Given that it targets low-bandwidth mobile phones they’ve obviously encoded for size, which is made easier by the fact that it also targets a specific schema, i.e. it’s not a generic XML encoding.

Omri’s thesis is that there are multiple things that you might want to optimize for: size, parsing speed and overhead for generating the data and that it is not possible to define a file format that satisfies all of those different needs.

Well, I thought we had already done this with XML. The fact is, you can already use your own favourite encoding. If I want to make it easy to parse then I can use UTF-32; if I want to be ‘more standard’ I can use UTF-8. I’m not sure about this (I need to check the spec), but in theory you should be able to use any encoding that you want to, as long as all parties agree on it.
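To put rough numbers on the trade-off, here is a small sketch that serialises the same tiny document in two encodings and compares the byte counts. The document content and the `EncodingSizes` class are made up for the example, and it assumes the Java runtime provides a charset named UTF-32 (common, but not one of the charsets every runtime is required to support).

```java
import java.nio.charset.Charset;

class EncodingSizes {
	// Builds a small document whose XML declaration names the given encoding.
	static String document(String encoding) {
		return "<?xml version='1.0' encoding='" + encoding + "'?>"
			+ "<message><to>someone</to><body>hello</body></message>";
	}

	// Number of bytes the document occupies when serialised in that encoding.
	static int encodedSize(String encoding) {
		return document(encoding).getBytes(Charset.forName(encoding)).length;
	}

	public static void main(String[] args) {
		// UTF-8: one byte per ASCII character, variable width in general.
		System.out.println("UTF-8:  " + encodedSize("UTF-8") + " bytes");
		// UTF-32: fixed four bytes per character, simple to index but bulky.
		System.out.println("UTF-32: " + encodedSize("UTF-32") + " bytes");
	}
}
```

For mostly ASCII markup the fixed-width form is roughly four times the size, which is the cost of the simpler decoding.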