One of the fascinating aspects of the technology business, to me, is the degree to which – for better or worse – it represents a microcosm of the larger non-technology world. From politics to religion, it’s all here. If you’re looking for proof, look no further than the current scuffle around the Open Document Format and OpenOffice.org, which is as much about abstract concepts like control and sovereignty as it is about technology.
The latest example of this is a piece from ZDNet’s George Ou entitled “OpenOffice.org 2.0 is here, but is it a pig?” IMO, this entry amply demonstrates the natural human tendency to believe what we want to believe, and to select our evidence accordingly.
It’s not that I fully disagree with Ou; I actually semi-defended his test results in an earlier entry here, and I will readily concede that Microsoft Office is a better-performing piece of software than its open source counterpart, OpenOffice.org (OO.o). I use OO.o every day and Microsoft Office perhaps a few times a week, and even on older hardware Microsoft’s package is snappier. But the question is: how much snappier?
Apparently not content with the widely accepted conclusion that Microsoft Office is faster than OO.o, Ou seems fixated on proving that OO.o is in fact so slow as to be not usable. What gets lost in his analysis however are simple questions such as: usable for whom? and for which documents?
To ‘prove’ that OO.o is a pig, Ou offers up a sample file here for users to perform their own tests. So far, so good. But once you get the file down, you may notice something a little bit odd: the file is 3.6 MB in size. That’s larger than just about any spreadsheet I’ve seen, but go one step further and unzip the file, and you’ll discover that the content.xml portion of his sample explodes to 279.5 MB. That seemed a bit ridiculous to me, but I’ll set it aside for the moment. I then opened the file as he requested, and on my hardware, which runs at a third of the speed of his test bed, I can confirm that OO.o took just a couple of ticks short of forever to open it, and my hard drive and CPU were seriously winded when they reached the finish line.
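For anyone curious how a 3.6 MB download balloons like that: OpenDocument files are ordinary ZIP archives, with the document body stored in content.xml, and highly repetitive XML compresses extremely well. A minimal Python sketch of the effect – it builds a toy stand-in archive rather than Ou’s actual sample, so the exact numbers are illustrative only:

```python
import io
import zipfile

# Build a toy stand-in for an .ods file: OpenDocument files are ordinary
# ZIP archives, with the document body stored in content.xml.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as z:
    # Highly repetitive XML compresses extremely well -- which is how a
    # 3.6 MB download can hide a ~280 MB content.xml.
    z.writestr("content.xml", "<cell>0</cell>" * 1_000_000)

# To inspect a real file, point ZipFile at the .ods path instead of `buf`.
with zipfile.ZipFile(buf) as z:
    info = z.getinfo("content.xml")
    ratio = info.file_size / info.compress_size
    print(f"uncompressed: {info.file_size / 1e6:.1f} MB")
    print(f"compressed:   {info.compress_size / 1e6:.2f} MB")
    print(f"ratio:        {ratio:.0f}x")
```

Run that against Ou’s sample and you can verify the 3.6 MB / 279.5 MB split yourself before deciding whether it resembles any spreadsheet you actually use.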
But the question here is: has Ou proven that OO.o is unusable, or that OO.o is unusable for a type of file that a very small percentage of users is ever likely to see? I could claim that Tomcat is unusable because it doesn’t scale like WebSphere XD, and while the latter point may be true it certainly doesn’t mean the former is.
As is probably obvious, I’m of the opinion that Ou, by choosing the example he did, has proven nothing with respect to general usage of OO.o. I just scanned my local directories here, for example, and the single largest spreadsheet I discovered – apart from the test file – is just shy of 1 MB. How does OO.o perform in that situation? The file opens in 30 seconds or so, well within the realm of acceptable performance for this user.
I’m certainly not the only one to quarrel with Ou’s approach here – his talkbacks (ZDNet-speak for comments) are rife with people taking him to task for the size of the sample file, and the discussion has generated 230 comments so far and shows few signs of slowing down. He responded to one of those questioning his choice of a sample file by saying this:
I analyze perfmon log files that are very large all the time. OO.o would be too slow on these files. I’ve had users with very large files, maybe not 200 MB but 80 MB. Maybe it only takes 1 minute to open, but that’s unacceptable.
Ou’s contention then is that he selected such a large file because he believes that they’re relatively common; I personally don’t buy that. I believe rather that he picked it to make his original contention stronger; in his September 14th article, Ou made the following point:
Most computer users would pay a huge premium for hardware that is 2 times faster, so why would they want software that was up to 98 times slower?
Ninety-eight times slower does indeed sound horrifying, but note the “up to”: practically speaking, that worst case only materializes when you blow file sizes up to the couple-of-hundred-megabyte range he’s talking about. When you’re dealing with the average spreadsheet that you or I might shoot around, the difference simply isn’t that significant – in other words, 98 times a load time measured in minutes is far more noticeable than 98 times one measured in fractions of a second. Hence the near 300 MB file.
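To make that concrete, here’s a back-of-the-envelope sketch. The 98x multiplier is Ou’s worst-case figure; the Excel-side baseline open times are purely illustrative assumptions on my part:

```python
# Back-of-the-envelope: what a worst-case 98x multiplier means at two very
# different baselines. The Excel-side open times below are assumptions.
MULTIPLIER = 98

small_excel_s = 0.3   # assumed: a ~1 MB spreadsheet opening in Excel
huge_excel_s = 60.0   # assumed: a ~300 MB spreadsheet opening in Excel

small_ooo_s = small_excel_s * MULTIPLIER   # roughly half a minute
huge_ooo_s = huge_excel_s * MULTIPLIER     # 5880 s, i.e. ~98 minutes

print(f"small file: {small_ooo_s:.0f} s in OO.o, worst case")
print(f"huge file:  {huge_ooo_s / 60:.0f} min in OO.o, worst case")
```

Even granting the worst-case multiplier across the board, the small file lands in the tolerable-if-annoying range, while the huge file climbs into territory nobody would accept – which is precisely why the huge file makes for the more dramatic headline.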
Interestingly, it’s not even that noticeable for more complex spreadsheets. It’s common knowledge that the folks who push their spreadsheets hardest are the finance geeks, so I took the time to email my brother this morning requesting a couple of reasonably complex spreadsheets for testing purposes. My brother, by way of background, is I-banking trained in his usage of Excel and currently employed by a relatively well known hedge fund. The files he sent over were financial models of two public companies, and essentially reflect the financial well-being of the institutions in question in spreadsheet form. One spreadsheet has 7 sheets and the other 12, and the sizes? 188.5 KB and 294 KB, respectively. Both opened in my copy of OO.o 2.0 in maybe a 20-30 second timeframe, and rendered properly even with the macro warnings. I asked him how often he dealt with huge Excel files of the size that Ou featured, and his response was that they were very much the exception to the rule – and his is the profession that conventional wisdom, at least, says holds the true power users of the format.
Now that’s just one example, of course, and given that I can’t distribute the samples you can – as always – choose to believe it or not believe it. But I have to question any general analysis of a product which relies on something so far outside the norm of what your average user can be expected to use, and at least consider the fact that there’s some human nature at work. Is OO.o slower? Yup. Is Ou likely correct that OO.o is poor to quite poor at editing the document sizes he’s discussing? Certainly seems that way. But is OO.o that much slower for regular, average office software users? Not in my experience. Ou’s chosen example seems to me to be the technological equivalent of proof-texting, and says at least as much about the analysis as it does about OO.o.
A skillset I most recently employed while house hunting, where he built me a very nice model of assets, liabilities, and income/expenditures.