One of the more rewarding recent trends that I’ve been observing in the development space has been the acceptance, grudging though it may be at times, that different development challenges may in fact necessitate different tools and technologies. Not new tools, necessarily, but different tools. The most recent example of this is the IBM/Zend deal, but this recognition can also be seen driving other developments such as the overwhelming preference from developers for REST-based web service development. The developers themselves are light years ahead of the vendors that support them in the realization of this truth, but give them time and I’m sure the technology purveyors will get there. The winners will, anyway.
But the larger issue, at least for this analyst, is this simple and often asked question: what’s next? What does this trend portend beyond the world of language preferences or web services approaches? Well, as far as I can tell, the next major area affected will be the data layer.
Now before we get there, let’s have a quick reality check. How many of you in reading the sentence above did the mental substitution of “relational database” for “data layer”? I know I probably would have, and I’m guessing most of you have as well. How could you not? Over the past few decades, persisting data for the applications that needed it meant a relational database. Sure, the names may have changed and vendors may have risen and fallen, and there was an exception or two such as Sleepycat’s Berkeley DB, but pretty much universally when people talk data persistence they mean “relational database.” Ok, maybe not in a zSeries shop, but pretty much everywhere else.
This virtual tunnel vision is certainly not due to a shortage of other options; vendors like iPedo, Sleepycat, and Software AG have been around for years and have offered a different way to tackle certain development challenges whether that’s XML, non-relational storage, or integration. But as with Windows on the desktop, the inertia of relational data storage as the de facto standard has been remarkable.
But I’ve been hearing a bit more grumbling lately (due perhaps to the invasion of simplicity elsewhere in the dev lifecycle?) from the development community that relational databases may not in fact be the answer to every problem. Many of them, perhaps, but not every problem. What are some of the shortcomings? Well, I’ve been recommending that everyone take a look at what Google’s Adam Bosworth had to say in a blog entry here and a Gillmor Gang session here. It is startling, when you take a step back, how difficult it is to use relational databases to solve non-trivial problems.
It may well be that the relational vendors can address some of the issues Bosworth and others have described, but I also see in the difficulties he cites opportunities for vendors with different approaches. I found, for example, the decision by the Mono team to ship DB4Objects with their package interesting not just because it augments the runtime and libraries with an out-of-the-box persistence layer, but also because that persistence layer is object-oriented. Christof Wittig and the gang from DB4Objects may position this as a solution for prototyping, but as I told him when we spoke at LinuxWorld, anyone who’s developed for any length of time knows what happens to prototype datastores. More often than not, they become production datastores because if they work, who’s got time to replace them?
But the acceptance is merely a symptom of a larger trend: the willingness to look at RDBMS alternatives. After all, if object-oriented databases can find acceptance, what of XML datastores? Might there be a sizable population of developers out there that would prefer to work with their XML natively, without shredding it and cramming it back into non-fitting and rigid tables? The answer’s clear from the conversations we’re having. For example: one of the roles we try to play as analysts with contacts both in the vendor community and the developer side is facilitation; technology matchmakers, if you will. When we asked the folks behind one project if they’d like a peek at some XML technologies we’d been discussing with a vendor client of ours, the answer was an immediate yes. This reaction was neither a surprise nor a fluke: the interest is clearly real.
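For anyone who hasn't felt the "shredding" pain firsthand, here's a rough sketch of what it looks like next to the native route. Everything in it is an assumption made up for illustration: the order document, the two-table schema, and the choice of Python's standard `sqlite3` and `xml.etree.ElementTree` modules as stand-ins for a relational store and an XML-aware one. It's not how any particular vendor's product works.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order document, invented for this illustration.
ORDER_XML = """
<order id="42">
  <customer>Acme</customer>
  <item sku="A1" qty="2"/>
  <item sku="B7" qty="1"/>
</order>
"""

root = ET.fromstring(ORDER_XML)

# Relational route: "shred" the document into a schema designed up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE items (order_id INTEGER, sku TEXT, qty INTEGER)")

order_id = int(root.get("id"))
conn.execute(
    "INSERT INTO orders VALUES (?, ?)",
    (order_id, root.findtext("customer")),
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(order_id, item.get("sku"), int(item.get("qty")))
     for item in root.findall("item")],
)
shredded_total = conn.execute(
    "SELECT SUM(qty) FROM items WHERE order_id = ?", (order_id,)
).fetchone()[0]

# Native route: ask the same question of the document as it stands.
native_total = sum(int(item.get("qty")) for item in root.findall("item"))

print(shredded_total, native_total)
```

Both routes answer the same question. The difference is that the relational one needs a schema and a mapping step before it can, and any element the schema didn't anticipate has nowhere to go.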
Whatever the technical basis, I think developers are beginning to loosen the relational shackles just a bit, to the point that they’re permitted to at least explore other technical options. Whether they’re XML, object-oriented or just really sophisticated filesystems is ultimately beside the point – it’s the fact they are actually meriting longer looks that matters. From a developer’s perspective, it’s about projects that can choose the technology that works best for their needs. Maybe that’s relational, but maybe it’s not. Been a while since many have had that choice, and it’ll be fascinating to see what people do with that freedom.
Dan Brackett says:
March 7, 2005 at 12:09 pm
I think that the distinction between wire formats and storage media will continue to blur. The Architecture Astronaut in me notes that only the time component differs between wire formats and storage, and sometimes then not that much — look at the unix tools tradition of using temp files for transient, potentially asynchronous, inter-tool communication. DB4Objects leans this way, too — "treat your storage the same way you'd treat anything else: as an object." Python's object-serialization mechanism feels very similar to me, too (check Bram Cohen's site, or just open a .torrent, for a pretty lucid demonstration of python dict serialization).
All that said, developers still need to be aware of the fact that different storage mechanisms have very different tradeoffs in terms of reliability and performance and indexability and so forth. Yes, choice is good — but choice needs to be educated, or we users will all suffer.
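A small sketch of the wire-format/storage point, for illustration only: the same serialized bytes go to a temp file and across a local socket pair, and come back as the same dict either way. The record is made up, and Python's `pickle` is an assumed stand-in here; it is not the encoding .torrent files use, and it should never be used to load untrusted data.

```python
import pickle
import socket
import tempfile

# Hypothetical record, invented for this illustration.
record = {"name": "example", "length": 1024, "tags": ["a", "b"]}

# Serialize once; the bytes don't know where they're headed.
payload = pickle.dumps(record)

# Storage: write the bytes to a temp file and read them back.
with tempfile.TemporaryFile() as f:
    f.write(payload)
    f.seek(0)
    from_disk = pickle.loads(f.read())

# Wire: push the very same bytes through a local socket pair.
sender, receiver = socket.socketpair()
sender.sendall(payload)
sender.close()  # closing signals end-of-stream to the receiver

chunks = []
while True:
    chunk = receiver.recv(4096)
    if not chunk:
        break
    chunks.append(chunk)
receiver.close()
from_wire = pickle.loads(b"".join(chunks))

print(from_disk == record, from_wire == record)
```

What the sketch leaves out is exactly the caveat above: durability, indexing, and performance differ enormously between a file, a socket, and a real datastore, even when the bytes are identical.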
March 7, 2005 at 2:16 pm
depending on how you define wire formats, i either agree or disagree with you. there's no question, however, that distinctions between storage media and the data persistence mechanisms that sit on top of them have changed; one has only to look at your firm's ZFS to know that.
i would probably take exception as well that only the time component differs; one thing we're seeing more of is efforts to prioritize based on payload information: giving voice packets a higher priority, for example. storage, on the other hand, has had these notions for a long time – and of late we've seen things like Centera and DR450 take that to the next level.
but on the last point, we're definitely in agreement: choice is definitely a welcome development, but brings with it substantial responsibility.
to put it another way: no one gets fired for going with a relational DB, but you might for picking something new. the trick, as always, is to understand the relative merits, strengths and weaknesses of individual platforms and to design accordingly.
Mike Champion says:
March 7, 2005 at 3:16 pm
"I think developers are beginning to loosen the relational shackles just a bit".
Hmmm … it seems to me (although I spent most of the last 5 years at Software AG so I might be biased) that developers loosened the relational shackles some time ago. (C. J. Date and the other cheery folks at dbdebunk dot com of course think developers never locked themselves in the shackles in the first place, but that's another matter). In any event, multivalued databases (e.g. Pick) and OO databases have been around for a long time, and XML DBMS are fairly mature after 5 years or so. Now the major DB vendors are introducing "native" XML capabilities, merging in full text capabilities, and so on. Are people looking for RDBMS alternatives, or expecting "databases" to handle all sorts of structured, unstructured, and semi-structured data within the same products, APIs, etc.?
stephen o'grady says:
March 7, 2005 at 4:58 pm
mike: my intention was not to imply that these technologies are new; if that's what came across, i messed up somewhere. as i mention, vendors like Software AG, Sleepycat et al have been around for years, and done ok for themselves. OO DB's have likewise been around for eons – though on another note one of the challenges, IMO, for a vendor like DB4Objects is to shed some of the baggage that comes along with that label. so what's not new is the technology, but the acceptance of that technology.
what it comes down to, IMO, is the typical enterprise conservatism. enterprises embrace new technologies slowly (if at all, in some cases), and the data layer (unlike the application layer, one could argue) has *always* been a strategic decision. that caution, however, is increasingly being tempered by a willingness to consider that in some situations, an RDBMS is perhaps not the only hammer for that nail. perhaps your experience at Software AG was different, but that – to us – is very different, because it could herald the entrance of non-relational (XML or otherwise) data stores as a mainstream option.
to answer your last question, it's not about products, it's about need. given that the relational market is fairly competitive, we have yet to speak to anyone who's considering alternatives for the sake of considering alternatives. instead, it's driven by needs that are not being satisfactorily met. whoever supplies that functionality will be considered – vendor, open source, or otherwise.