tecosystems

XML and Office Formats: There’s More to it Than Accessibility


While speaking with the folks from Scalix this afternoon, I happened to reiterate a point I’ve been discussing more and more of late: the programmability of the Open Document Format (ODF). The context for mentioning it was the fact that Scalix has the ability to dynamically transform certain types of documents – RTF in the example I saw – on the fly to other formats depending on how and where they are viewed.

It’s precisely that type of transform that’s being lost in the shuffle, in my opinion, in the fascination with the politics [1] behind the State of Massachusetts’ decision to mandate ODF as a standard. What’s brand new with the ODF and its Microsoft counterpart, the Office Open XML Formats, is the ability to manipulate documents programmatically. With these formats, we could see far more intelligent messaging, workflow, and similar systems, because they’ll have the ability to deconstruct documents at very granular levels.

To many of you, no doubt, that’s just a bunch of meaningless technical babble, so here’s an example that may make it more real: you know how organizations have been bitten in the past by sending out documents that, unbeknownst to them, included all sorts of unflattering or embarrassing comments and markup? With the previous generation of binary office formats, there wasn’t too much you could do, because the documents were opaque to any application besides the ones that produced them. The new XML-based formats, on the other hand, are essentially zip containers with a bunch of XML components in them. What if, before you emailed someone external to the system, the messaging server could deconstruct the document into its various pieces, remove any comments or markup, then reconstruct the document without touching the content? That’s the kind of document manipulation made possible by the transition from binary to XML formats.
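To make that scenario a little more concrete, here is a rough sketch in Python of what such a scrubbing step might look like for an ODF file. It is not how Scalix or any other product does it. It assumes that comments live in `office:annotation` elements inside the package's `content.xml`, and the function names (`strip_annotations`, `scrub_odf`) are made up for illustration. Tracked changes and other markup would need their own handling, which is not attempted here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumption: ODF comments are office:annotation elements in content.xml.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
ANNOTATION_TAGS = {
    f"{{{OFFICE_NS}}}annotation",
    f"{{{OFFICE_NS}}}annotation-end",
}


def strip_annotations(xml_bytes: bytes) -> bytes:
    """Return the XML with annotation elements removed, body text kept."""
    # Re-register the document's own prefixes so ElementTree does not
    # rename them to ns0, ns1, ... when serializing.
    for _, (prefix, uri) in ET.iterparse(io.BytesIO(xml_bytes), events=["start-ns"]):
        ET.register_namespace(prefix, uri)

    root = ET.fromstring(xml_bytes)
    for parent in list(root.iter()):
        previous = None
        for child in list(parent):
            if child.tag in ANNOTATION_TAGS:
                # Text that follows the comment belongs to the document,
                # not the comment, so reattach it before removing.
                tail = child.tail or ""
                if previous is None:
                    parent.text = (parent.text or "") + tail
                else:
                    previous.tail = (previous.tail or "") + tail
                parent.remove(child)
            else:
                previous = child
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def scrub_odf(data: bytes) -> bytes:
    """Unzip an ODF package, scrub content.xml, and zip it back up."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            payload = src.read(info.filename)
            if info.filename == "content.xml":
                payload = strip_annotations(payload)
            # Reusing the original ZipInfo keeps entry order and each
            # entry's compression method (e.g. the stored mimetype entry).
            dst.writestr(info, payload)
    return out.getvalue()
```

A mail gateway could, in principle, call something like `scrub_odf(attachment_bytes)` on outbound attachments. The point is less the specific code than the fact that nothing beyond a zip library and an XML parser is needed, which was simply not true of the binary formats.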

That’s just one example, of course; there are likely hundreds if not thousands of other such scenarios buried within corporate workflow and routing procedures. This is, in part, what ODF advocates such as Gary Edwards and Sam Hiser mean when they talk about the format as a “Universal Transformation Layer.” What’s been interesting, however, is that in the probably half a dozen conversations I’ve had with messaging/workflow/etc providers over the past couple of weeks, I’ve heard a lot of enthusiasm for the ODF itself, but little in the way of plans – NDA or otherwise – to fully leverage the XML nature of the format.

Sooner rather than later, however, I expect one or more of the messaging/workflow/etc providers to actively embrace the ODF – and it would be foolish to believe that Microsoft doesn’t have plans in this regard for its own Office Open XML Formats. The interesting question for many vendors will be: how might the new XML formats overlap – or not – with electronic forms?

Either way, we should expect to see a lot more pieces like this. While the promised longevity of XML-based documents may be exciting to governments and libraries, developers are likely to be far more interested in what the formats allow them to do now that they couldn’t do before than in whether or not the documents will be readable in 50 years. Fortunately for them, the implications of the new formats are profound.

[1] The term is used here in both the figurative and literal senses. One commenter I spoke with the other day suggested that the ODF might play a role in the much anticipated bid for the Presidency by current MA Governor Mitt Romney.