Friday, April 17, 2009

On XML

In his keynote talk at IDtrust2009 on Tuesday, Dan Blum made this comment about OpenID:

One reason it’s popular is that it’s pretty easy to implement — it’s not even XML.

XML is perhaps one of the most misunderstood pieces of the standards repertoire.

On the one hand, there’s an assumption that “it’s in XML” means that it’s somehow “standard”. What’s true here is that the raw data format is standard, so a standard parser can be used to pull the semantic bits out of the data. But understanding the semantics of those bits and making sense of them requires some other definition on top of that. Without, say, a defined standard schema, along with a document that explains how to use the information that you can extract, the XML is useless.

On the other hand, I commonly hear that “XML is slow,” “XML is too complicated,” and, related to Dan’s comment, “XML is hard to use.”

It doesn’t have to be any of those. There are reasonably efficient XML parsers available. You can write your own, of course, if you think you can do better. But the beauty of using a standard language is that you don’t have to. Instead, you can spend your effort writing code that deals with the semantics — the code that actually does the work. To my mind, that makes it easier to use than custom languages, not harder.
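As a rough illustration of that division of labor, here is a minimal Python sketch using the standard library's xml.etree.ElementTree parser. The document, its element and attribute names (skiReport, trail, status, depthCm), and the open_trails helper are all hypothetical, invented for this example; they don't come from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element and attribute names are made up for
# this sketch and do not come from any real schema.
REPORT = """\
<skiReport resort="Example Peak">
  <trail name="Easy Way" status="open" depthCm="85"/>
  <trail name="Black Ice" status="closed" depthCm="40"/>
  <trail name="Long Run" status="open" depthCm="110"/>
</skiReport>
"""


def open_trails(xml_text):
    """Return (name, depth_cm) pairs for trails marked open."""
    # The standard parser deals with the syntax; no custom parsing code.
    root = ET.fromstring(xml_text)
    # Everything below is the "semantics" part. It only works because we
    # have assumed what <trail>, status, and depthCm mean.
    return [
        (trail.get("name"), int(trail.get("depthCm")))
        for trail in root.iter("trail")
        if trail.get("status") == "open"
    ]


if __name__ == "__main__":
    print(open_trails(REPORT))
```

The parsing takes one line; the rest is code that depends on an agreed meaning for the elements, which is exactly the part a schema and its accompanying documentation would have to define.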

As for its being “too complicated,” well, that’s arguable. It certainly does look complicated to the eye. On the other hand, it’s actually quite simply defined, and, while an XML document can be a lot to look at, it is readily readable by a human, not just by a computer (in contrast with, say, ASN.1).

What’s also true about XML is that it’s often presented as the solution to every problem. The resulting overexposure puts some people off: some object to any proposal that a new thing be done in XML, in opposition to others who propose XML for everything.

XML isn’t for everything, but it’s great for some things... and it’s not “hard to use.”

1 comment:

Thomas J. Brown said...

As a developer, one of the things I like about (well-formed) XML is that the data is presented in a highly structured and predictable way.

For example, when I was building a ski report a few months ago, I was able to make certain assumptions about what kinds of information were being stored where. This made the ski report significantly easier to build and provided a richer experience for users.

I think we (developers) like XML – or at least like the idea of it – so much because of that sense of predictability. Of course, that makes all kinds of assumptions about the validation being done to the data going into the XML to begin with...