One of the interesting byproducts of last week’s Microsoft Office Open XML Formats announcement has been the ensuing discussion on the definitions of base terms such as open, standard and format. Given the volume of email and comments I’ve received on these topics, I thought it would be useful to clarify exactly what I mean when I use those respective terms.
Let’s start with open. Several of the emails I’ve received have taken me to task for using the term “open” to describe Microsoft’s new formats. A bunch of the IBMers, for example, seem inclined to define open as not just publicly available, but industry-wide and multi-vendor in nature. See Bob Sutor, Bobby Woolf, or Tom Glover. I don’t agree with that notion; open to me – in the context of discussions around formats for interchange – connotes nothing more than a fully documented specification that external parties can examine and implement with full fidelity, preferably under royalty-free terms. Given this, I have no difficulty describing the new Office formats as open, just as I’d use the same term to describe Adobe’s PDF. Are the new Office Open XML Formats open in the same way that open source is open? Hardly, but given the intrinsic differences between document formats and source code, I think it’s inevitable that the word open will have different connotations in the different contexts.
Of course, as the IBMers and others have pointed out, the term open is currently being applied to efforts which are not participatory by nature. But if we’re looking for a label to distinguish multi-vendor backed efforts from those controlled by a single party, I’d argue that we already have such a term: standard. A standard is just what it implies: something multiple (even competing) parties can agree on, and implement independently of one another. We have numerous standards bodies that exist solely to govern these sorts of multi-vendor efforts: ISO, OASIS, the W3C, etc. Unfortunately, there are also standards that not everyone agrees on, but that everyone is compelled to comply with because of their dominant market share. These are referred to as de facto standards, and the Microsoft Word binary format is one example. As long as we can all agree to recognize the difference between de facto standards and honest standards, however, this should not be an insurmountable problem.
But either way, let’s go back to our definition of open, defined (simply) as publicly available and documented, then combine that with our definition of standard as a vendor-neutral, participatory effort. The end result? An open standard. Does the Open Document Format qualify for this term? Certainly. How about the Office Open XML Formats? No; they fail as a standard. They’re still by my definition an open format, but not an open standard. If I were an enterprise buyer with particular concerns about interoperability, total cost of ownership or longevity, what would I choose? An open standard. Does that mean that the Office Open XML Formats aren’t open? I don’t believe so.
Anyhow, I know many out there – the IBMers in particular – won’t agree with this classification, but I wanted everyone following the discussions in this space (I highly recommend checking out the comment trails here or here) to understand why and when I use the terms open, standard and format so that at least the context for such claims is understood. Feel free to poke holes in my definitions as well; I’m willing to be persuaded if I’m wrong on this score.
Update: Bill’s proposing an interesting compromise here. Being halfway between my take (inclusion is part of the term standard) and some others’ (open requires inclusion), it’s definitely worth a look. I don’t know how well two-dimensional arguments like Bill’s map to terminology, but it’s a start.