tecosystems

Everyone Gets a Database


Pendulum

Once upon a time, there were software categories called Application Performance Monitoring (APM) and Logging. They each involved the collection of large volumes of telemetry data, used among other things for the purpose of understanding and attacking problems at varying layers of the enterprise application stack.

As time passed and infrastructure grew more distributed and applications more complex, a new software category gradually emerged: observability. Its aim was to provide those charged with running applications and infrastructure a more nuanced, granular and integrated view of software problems that might be known or unknown, ephemeral and/or involve multiple layers of a given stack. This approach proving effective, the category attracted more attention, and consequently money – both investment and revenue – began to flow more freely.

Unsurprisingly, then, vendors in the APM and Logging categories concluded that this newly emerging adjacent market represented both a logical extension of their existing capabilities and a potentially lucrative growth opportunity. Rather than leave money on the table, many vendors in these spaces grew sideways into the observability market, competing with native observability players.

This is, in its own way, both a case study in market consolidation and just another in a long line of such cases. So, increasingly, is the consolidation we see within the database and data platforms market.


Almost four years ago, it became apparent that the pendulum that had swung away from general purpose databases and towards an array of specialized datastores had reversed and was well into its return trajectory. As captivating as the idea and abilities of bespoke databases built for a singular purpose were, the reality of both developing to and continually operating multiple databases (as well as significantly expanding the vendor procurement footprint) had set in. Enterprises and developers alike, though for reasons that had little in common, increasingly favored databases that could handle multiple workloads through a single engine and interface.

In the years since, that directional shift has not slowed. If anything, it’s accelerated. Database consolidation continues apace, and single workload databases are increasingly the exception rather than the rule.

There is a larger question facing the data sector, however: where and how will data lakes(houses) and databases collide? Recent events are suggestive, but the history is contradictory.

Five years ago, MongoDB – born as a document database but having since added the ability to handle workloads well beyond that, including search, stream and vector – announced a new set of capabilities including a data lake product. This was an early effort to begin to converge the database with the large scale data stores underneath it. Two years after that, the company followed up that announcement with refinements on both the object storage and analytical query fronts.

Large scale data storage and databases, it seemed, would follow the macro trend and converge towards a single interface, with the added bonus that procurement would only have to deal with one vendor. The trajectory seemed clear.

Clear, except that last fall MongoDB announced that it was deprecating its data lake offering, and that it would be end-of-lifed within a year, or three months from today. The question then became whether MongoDB’s deprecated effort to merge database with data platform was the outlier, or the shape of things to come.

That answer won’t be evident for some time, but two notable acquisitions suggest that convergence may yet be on the way.

  • One month ago on May 14th, Databricks acquired the database company Neon – a serverless Postgres database vendor that was well thought of amongst friends of RedMonk. The acquisition cost was $1B.
  • Earlier this week, meanwhile, its biggest rival Snowflake agreed to acquire another Postgres database vendor, Crunchy Data, for $250M.

That these two are dueling is, of course, nothing new. See, for example, their competition over the DBRX (Databricks) and Arctic (Snowflake) models, or the tug of war over Tabular – ultimately acquired by Databricks for $2B.

Both companies obviously see a future in which AI plays a, if not the, critical role with respect to data. That is logical given that AI is built on and from a high volume of data, and that AI favors existing data incumbents for both trust and data gravity reasons. But then again, every software category today is making enormous bets on AI.

It is notable, however, that both vendors just as clearly see traditional database capabilities – PostgreSQL capabilities in particular – as likely to become, if they are not already, table stakes. Convergence, put simply, is the goal.

The challenge for these data platforms, however, as it was for MongoDB when it launched its data lake product, is market permission. While it makes all the sense in the world on paper for data platforms and databases to come together, markets do not always follow what makes sense on paper, and do not always embrace a new product in a market new to the vendor. Enterprises are cautious about investing in products offered by vendors whose fundamental DNA lies in an entirely distinct market with different expectations. And from the seller’s side of the equation, vendors need to learn how to go to market and sell to different users and different buyers with differing sets of concerns.

It is too early to say whether Databricks and Snowflake will be granted permission to compete directly in database markets, or whether they have or will acquire the ability to do so efficiently. But they’re collectively betting a billion and a quarter dollars that MongoDB had the right of it back in 2020, not in 2024, and that the market wants data lakes and databases offered by the same, single supplier.

They’re making the same bet, in other words, that APM and Logging companies made when they implicitly argued that observability should be a feature of their existing product rather than a brand new market of its own.

Disclosure: Crunchy Data and MongoDB are RedMonk customers. Databricks and Snowflake are not currently RedMonk customers.