RedMonk

The Scale Imperative

The Computing Scale Co

Once upon a time, the larger the workload, the larger the machine you would use to service it. Companies from IBM to Sun supplied enormous hardware packages to customers with similarly outsized workloads. IBM, in fact, still generates substantial revenue from its mainframe hardware business. One of the under-appreciated aspects of Sun’s demise, on the other hand, was that it had nothing to do with a failure of its open source strategy; the company’s fate was sealed instead by the collapse in sales of its E10K line, due in part to the financial crisis. For vendors and customers alike, mainframe-class hardware was the epitome of computational power.

With the rise of the internet, however, this model proved less than scalable. Companies founded in the late 1990s like Google, whose mission was to index the entire internet, looked at the numbers and correctly concluded that the economics of that mission on a scale-up model were untenable. With scale-up an effective dead end, the remaining option was to scale-out. Instead of big machines, scale-out players would build software that turned lots of small machines into bigger machines, e pluribus unum writ in hardware. By harnessing the collective power of large numbers of low cost, comparatively low power commodity boxes, the scale-out pioneers could scale to workloads of previously unimagined size.

This model was so successful, in fact, that over time it came to displace scale-up as the default. Today, the overwhelming majority of companies scaling their compute requirements are following in Amazon, Facebook and Google’s footprints and choosing to scale-out. Whether they’re assembling their own low cost commodity infrastructure or out-sourcing that task to public cloud suppliers, infrastructure today is distributed by default.

For all of the benefits of this approach, however, the power afforded by scale-out did not come without a cost. The power of distributed systems mandates fundamental changes in the way that infrastructure is designed, built and leveraged.

Sharing the Collective Burden of Software

The most basic illustration of the cost of scale-out is the software designed to run on it. As Joe Gregorio articulated seven years ago:

The problem with current data storage systems, with rare exception, is that they are all “one box native” applications, i.e. from a world where N = 1. From Berkeley DB to MySQL, they were all designed initially to sit on one box. Even after several years of dealing with MegaData you still see painful stories like what the YouTube guys went through as they scaled up. All of this stems from an N = 1 mentality.

Anything designed prior to the distributed system default, then, had to be retrofitted – if possible – to not just run across multiple machines instead of a single node, but to run well and take advantage of their collective resources. In many cases, it proved simpler to start from scratch. The Google Filesystem and MapReduce papers that resulted in Hadoop are one example of this; at its core, the first iterations of the project were designed to deconstruct a given task into multiple component tasks to be more easily executed by an array of machines.
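As a rough illustration of that kind of decomposition, here is a toy sketch in Python. It is not Hadoop's actual API: the function names (split_into_chunks, count_words, merge_counts, word_count) are hypothetical, and a local thread pool merely stands in for the array of machines. A word count is split into independent per-chunk tasks whose partial results are merged at the end.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(lines, n_chunks):
    """Deal the input lines into n_chunks roughly equal component tasks."""
    return [lines[i::n_chunks] for i in range(n_chunks)]


def count_words(chunk):
    """Map step: count words in one chunk, independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(partials):
    """Reduce step: combine the partial counts into a single result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(lines, n_workers=4):
    """Split the job, run each piece on its own worker, then merge."""
    chunks = split_into_chunks(lines, n_workers)
    # Worker threads stand in for separate machines in this sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return merge_counts(partials)
```

Because no chunk depends on any other, the per-chunk step could in principle run anywhere; only the final merge needs to see every partial result. A real distributed system also has to handle what this sketch ignores, such as data placement and machine failure.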

From the macro-perspective, besides the inherent computer science challenges of (re)writing software for distributed, scale-out systems – which is exceptionally difficult – the economics were problematic. With so many businesses moving to this model in a relatively short span of time, a great deal of software needed to get written quickly.

Because no single player could bear the entire financial burden, it became necessary to amortize the costs across an industry. Most of the infrastructure we take for granted today, then, was developed as open source. Linux became an increasingly popular operating system choice as both host and guest; the project, according to Ohloh, is the product of over 5500 person-years in development. To put that number into context, if you could somehow find and hire 1,000 high quality kernel engineers, and they worked 40 hours a week with two weeks vacation, it would take you about five and a half years to match that effort. Even Hadoop, a project that hasn’t had its 10 year anniversary yet, has seen 430 person-years committed. The even younger OpenStack, a very precocious four years old, has seen an industry conglomerate collectively contribute 594 person-years of effort to get the project to where it is today.

Any one of these projects could be singularly created by a given entity; indeed, this is common. Just in the database space, whether it’s Amazon with DynamoDB, Facebook with Cassandra or Google with BigQuery, each scale-out player has the ability to generate its own software. But this is only possible because they are able to build upon the available and growing foundation of open source projects, where the collective burden of software is shared. Without these pooled investments and resources, each player would have to either build or purchase at a premium everything from the bare metal up.

Scale-out, in other words, requires open source to survive.

Relentless Economies of Scale

In stark contrast to the difficulty of writing software for distributed systems, microeconomic principles love them. The economies of scale that larger players can bring to bear on the markets they target are, quite frankly, daunting. Their variable costs decrease due to their ability to purchase in larger quantities; their fixed costs are amortized over a higher volume customer base; their relative efficiency can increase as scale drives automation and improved processes; their ability to attract and retain talent increases in proportion to the difficulty of the technical challenges imposed; and so on.

It’s difficult to quantify these advantages in precise terms, but we can at least attempt to measure the scale at which various parties are investing. Specifically, we can examine their reported plant, property and equipment investments.

If one accepts the hypothesis that economies of scale will play a significant role in determining who is competitive and who is not, this chart suggests that the number of competitive players in the cloud market will not be large. Consider that Facebook, for all of its heft and resources, is a distant fourth in terms of its infrastructure investments. This remains true, importantly, even if their spend was adjusted upwards to offset the reported savings from their Open Compute program.

Much as in the consumer electronics world, then, where Apple and Samsung are able to leverage substantial economies of scale in their mobile device production – an enormous factor in Apple’s ability to extract outsized and unmatched margins – so too is the market for scale-out likely to be dominated by the players that can realize the benefits of their scale most efficiently.

The Return of Vertical Integration

Pre-internet, the economics of designing your own hardware were less than compelling. In the absence of a global network, not to mention less connected populations, even the largest companies were content to outsource the majority of their technology business, and particularly hardware, to specialized suppliers. Scale, however, challenged those economics on a fundamental level, and forced those at the bleeding edge to rethink traditional infrastructure design, questioning all prior assumptions.

It’s long been known, for example, that Google eschewed purchasing hardware from traditional suppliers like Dell, HP or IBM in favor of its own designs manufactured by original design manufacturers (ODMs); Stephen Shankland had an in-depth look at one of their internal designs in 2009. Even then, the implications of scale are apparent; it seems odd, for example, to embed batteries in the server design, but at scale, the design is “much cheaper than huge centralized UPS,” according to Ben Jai. But servers were only the beginning.

As it turns out, networking at scale is an even greater challenge than compute. On November 14th, Facebook provided details on its next generation data center network. According to the company:

The amount of traffic from Facebook to Internet – we call it “machine to user” traffic – is large and ever increasing, as more people connect and as we create new products and services. However, this type of traffic is only the tip of the iceberg. What happens inside the Facebook data centers – “machine to machine” traffic – is several orders of magnitude larger than what goes out to the Internet…

We are constantly optimizing internal application efficiency, but nonetheless the rate of our machine-to-machine traffic growth remains exponential, and the volume has been doubling at an interval of less than a year.

As of October 2013, Facebook was reporting 1.19B active monthly users. Since that time, then, machine to machine east/west networking traffic has more than doubled. Which makes it easy to understand how the company might feel compelled to reconsider traditional networking approaches, even if it means starting effectively from scratch.
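To make the arithmetic behind that inference explicit: if traffic doubles every T months, it grows by a factor of 2^(t/T) after t months. The sketch below assumes a doubling interval of exactly 12 months, the slowest rate consistent with Facebook's "less than a year", and an elapsed time of 13 months (October 2013 to November 2014). Both inputs are illustrative assumptions, not figures Facebook reported, and the function name is hypothetical.

```python
def growth_factor(elapsed_months, doubling_months):
    """Multiplicative growth after elapsed_months, given a doubling period."""
    return 2 ** (elapsed_months / doubling_months)


# Assumptions: traffic doubles every 12 months (an upper bound on the
# interval) and 13 months have elapsed since the October 2013 figure.
factor = growth_factor(13, 12)
print(f"traffic multiplied by at least {factor:.2f}x")
```

Any shorter doubling interval only increases the factor, which is why "more than doubled" is the conservative reading.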

Earlier that week at its re:Invent conference, meanwhile, Amazon went even further, offering an unprecedented peek behind the curtain. According to James Hamilton, Amazon’s Chief Architect, there are very few remaining aspects to AWS which are not designed internally. The company has obviously dramatically grown the software capabilities of its platform over time: on top of basic storage and compute, Amazon has integrated an enormous variety of previously distinct services: relational databases, a Map Reduce engine, data warehousing and analytical capabilities, DNS and routing, CDN, a key value store, a streaming platform – and most recently ALM tooling, a container service and a real-time service platform.

But the tendency of software platforms to absorb popular features is not atypical. What is much less common is the depth to which Amazon has embraced hardware design.

  • Amazon now builds their own networking gear running their own protocol. The company claims their gear is lower cost and faster, and that the cycle time for bugs is reduced from months to weeks.
  • Amazon’s server and storage designs are custom to the vendor; the storage servers, for example, are optimized for density and pack in 864 disks at a weight of almost 2400 pounds.
  • Intel is now working directly with Amazon to produce custom chip designs, capable of bursting to much higher clock speeds temporarily.
  • To ensure adequate power for its datacenters, Amazon has progressed beyond simple negotiated agreements with power suppliers to building out custom substations, driven by custom switchgear the company itself designed.

Compute, networking, storage, power: where does this internal innovation path end? In Hamilton’s words, there is no category of hardware that is off-limits for the company. But the relentless in-sourcing is not driven by religious objections – such considerations are strictly functions of cost.

In economic terms, of course, this is an approximation of backward vertical integration. Amazon may not own the manufacturers themselves as in traditional vertical integration, but manufacturing is an afterthought next to the original design. By creating their own infrastructure from scratch, they avoid paying an innovation tax to third party manufacturers, can build strictly to their specifications and need only account for their own needs – not the requirements of every other potential vendor customer. The result is hardware that is, in theory at least, more performant, better suited to AWS requirements and lower cost.

While Amazon and Facebook have provided us with the most specifics, then, it’s safe to assume that vertical integration is a pattern that is already widespread amongst larger players and will only become more so.

The Net

For those without hardware or platform ambitions, the current technical direction is promising. With economies of scale growing ever larger and gradual reduction of third party suppliers continuing, cloud platform providers would appear to have margin yet to trim. And at least to date, competition on cloud platforms (IaaS, at least) has been sufficient to keep vendors from pocketing the difference, with industry pricing still on a downward trajectory. Cloud’s pricing advantage historically was the ability to pay less upfront and more over the longer term, but with base prices down 95% over a two year period, the longer term premium attached to cloud may gradually decline to the point of irrelevance.

On the software front, an enormous portfolio of high quality, highly valuable software that would have been financially out of the reach of small and even mid-sized firms even a few years ago is available today at no cost. Virtually any category of infrastructure software today – from the virtualization layer to the OS to the runtime to the database to the cloud middleware equivalents – has high quality, open source options available. And for those willing to pay a premium to outsource the operational responsibilities of building, deploying and maintaining this open source infrastructure, any number of third party platform providers would be more than happy to take those dollars.

For startups and other non-platform players, then, the combination of hardware costs amortized by scale and software costs distributed across a multitude of third parties means that effort can be directed towards business problems rather than basic, operational infrastructure.

The cloud platform players, meanwhile, symbiotically benefit from these transactions, in that each startup, government or business that chooses their platform means both additional revenue and a gain in scale that directly, if incrementally, drives down their costs (economies of scale) and indirectly increases their incentive and ability to reduce their own costs via vertical integration. The virtuous cycle of more customers leading to more scale leading to lower costs leading to lower prices leading to more customers is difficult to disrupt. This is in part why companies like Amazon or Salesforce are more than willing to trade profits for growth; scale may not be a zero sum game, but growth today will be easier to purchase than growth tomorrow – yet another reason to fear Amazon.

The most troubling implications of scale, meanwhile, are for traditional hardware suppliers (compute/storage/networking) and would-be cloud platform service providers. The former, obviously, are substantially challenged by the ongoing insourcing of hardware design. Compute may have been first, with Dell being forced to go private, HP struggling with its x86 business and IBM being forced to exit the commodity server business entirely. But it certainly won’t be the last. Networking and storage players alike are or should be preparing for the same disruption server manufacturers have experienced. The problem is not that cloud providers will absorb all or even the majority of the networking and storage addressable markets; the problem is that they will absorb enough to negatively impact the scale traditional suppliers can operate at.

Those that would compete with Amazon, Google, Microsoft et al, meanwhile, or even HP or IBM’s offerings in the space, will find themselves faced with increasingly higher costs relative to larger competition, whether it’s from premiums paid to various hardware suppliers, lower relative purchasing power or both. Which implies several things. First, that such businesses must differentiate themselves quickly and clearly, offering something larger, more cost-competitive players are either unable or unwilling to. Second, that their addressable market as a result of this specialization will be a fraction of the overall opportunity. And third, that the pool of competitors for base level cloud platform services will be relatively small.

What the long term future holds should these predictions hold up and the market come to be dominated by a few larger players is less clear, because as ever in this industry, their disruptors are probably already making plans in a garage somewhere.

Disclosure: Amazon, Dell, HP, IBM and Microsoft are RedMonk clients. Facebook and Google are not.

The Implications of IaaS Pricing Patterns and Trends

With Amazon’s re:Invent conference a week behind us and any potential price cuts or responses presumably implemented by this point, it’s time to revisit the question of infrastructure as a service pricing. Given what’s at stake in the cloud market, competition amongst providers continues to be fierce, driving costs for customers ever lower in what some observers have negatively characterized as a race to the bottom.

While the downward pricing pressure is welcome, however, it can be difficult to properly assess how competitive individual providers are with one another, all the more so because their non-standardized packaging makes it effectively impossible to compare service to service on an equal footing.

To this end we offer the following deconstruction of IaaS cloud pricing models. As a reminder, this analysis is intended not as a literal expression of cost per service; this is not, in other words, an attempt to estimate the actual component costs for compute, disk, and memory per provider. Such numbers would be speculative and unreliable, relying as they would on non-public information, but also of limited utility for users. Instead, this analysis compares base hourly instance costs against the individual service offerings. What this attempts to highlight is how providers may be differentiating from each other – deliberately or otherwise – by offering more memory per dollar spent, as one example. In other words, it’s an attempt to answer the question: for a given hourly cost, who’s offering the most compute, disk or memory?
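A minimal sketch of that question in Python follows. The provider names, package sizes and prices are hypothetical placeholders rather than values from the dataset, and the Instance class and helper functions are illustrative names, not part of any published tooling. The idea is simply to divide each resource by the hourly list price and rank the results.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    provider: str
    name: str
    price_per_hour: float  # USD, on-demand list price
    cores: int
    memory_gb: float
    disk_gb: float


def per_dollar(instance, attribute):
    """Units of a resource obtained per dollar of hourly spend."""
    return getattr(instance, attribute) / instance.price_per_hour


# Hypothetical packages; none of these figures come from the dataset.
instances = [
    Instance("ProviderA", "small", 0.10, 1, 4.0, 40.0),
    Instance("ProviderA", "large", 0.38, 4, 16.0, 160.0),
    Instance("ProviderB", "small", 0.08, 1, 2.0, 60.0),
    Instance("ProviderB", "large", 0.32, 4, 12.0, 250.0),
]


def rank_by(attribute):
    """Sort instances from most to least of the resource per dollar."""
    return sorted(instances, key=lambda i: per_dollar(i, attribute), reverse=True)


for inst in rank_by("memory_gb"):
    ratio = per_dollar(inst, "memory_gb")
    print(f"{inst.provider} {inst.name}: {ratio:.1f} GB per dollar-hour")
```

Ranking by "disk_gb" or "cores" instead reorders the same packages, which is the sense in which providers may differentiate on one resource without being cheapest overall.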

As with previous iterations, a link to the aggregated dataset is provided below, both for fact checking and to enable others to perform their own analyses, expand the scope of surveyed providers or both.

Before we continue, a few notes.

Assumptions

  • No special pricing programs (beta, etc)
  • Linux operating system, no OS premium
  • Charts are based on price per hour costs (i.e. no reserved instances)
  • Standard packages only considered (i.e. no high memory, etc)
  • Where not otherwise specified, the number of virtual cores is assumed to be equal to the available compute units

Objections & Responses

  • “This isn’t an apples to apples comparison”: This is true. The providers do not make that possible.
  • “These are list prices – many customers don’t pay list prices”: This is also true. Many customers do, however. But in general, take this for what it’s worth as an evaluation of posted list prices.
  • “This does not take bandwidth and other costs into account”: Correct, this analysis is server only – no bandwidth or storage costs are included. Those will be examined in a future update.
  • “This survey doesn’t include [provider X]”: The link to the dataset is below. You are encouraged to fork it.

Other Notes

  • HP’s 4XL (60 cores) and 8XL (103 cores) instances were omitted from this survey intentionally for being twice as large and better than three times as large, respectively, as the next largest instances. While we can’t compare apples to apples, those instances were considered outliers in this sample. Feel free to add them back and re-run using the dataset below.
  • While we’ve had numerous requests to add providers, and will undoubtedly add some in future, the original dataset – with the above exception – has been maintained for the sake of comparison.

How to Read the Charts

  • There was some confusion last time concerning the charts and how they should be read. The simplest explanation is that the steeper the slope, the better the pricing from a user perspective. The more quickly cores, disk and memory are added relative to cost, the less a user has to pay for a given asset.
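One way to put a number on "steeper is better" is to fit a line through each provider's (price, resource) points and compare the slopes. The sketch below is an assumption about how a reader might do this, not the method used to produce the charts; the provider names and data points are hypothetical.

```python
def slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    return covariance / variance


# Hypothetical (price per hour in USD, memory in GB) points per provider.
provider_points = {
    "ProviderA": [(0.10, 4), (0.20, 8), (0.40, 16)],
    "ProviderB": [(0.10, 2), (0.20, 4), (0.40, 8)],
}

slopes = {name: slope(points) for name, points in provider_points.items()}
# A steeper slope means more memory is added per additional dollar per hour.
best = max(slopes, key=slopes.get)
print(best, slopes)
```

With price on the horizontal axis and the resource on the vertical axis, the provider with the larger slope is the one whose line rises faster on the chart.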

With that, here is the chart depicting the cost of disk space relative to the price per hour.

This chart is notable primarily for two trends: first, the aggressive top line Amazon result and second, the Joyent outperformance. The latter is an understandable pricing decision: given Joyent’s recent market focus on data related workloads and tooling, e.g. the recently open sourced Manta, Joyent’s discounting of storage costs is logical. Amazon’s divergent pattern here can be understood as two separate product lines. The upper points represent traditional disk based storage (m1), which Amazon prices aggressively relative to the market, while the bottom line represents its m3 or SSD based product line, which is more costly – although still less pricey than alternative packages from IBM and Microsoft. Google does not list storage in its base pricing and is thus omitted here.

The above notwithstanding, a look at the storage costs on a per provider basis would indicate that for many if not most providers, storage is not a primary focus, at least from a differentiation standpoint.

As has historically been the case, the correlation between providers in the context of memory per dollar is high. Google and Digital Ocean are most aggressive with their memory pricing, offering slightly more memory per dollar spent than Amazon. Joyent follows closely after Amazon, and then come Microsoft, HP and IBM in varying order.

Interestingly, when asked at the Google Cloud Platform Live event whether the company had deliberately turned the dial in favor of cheaper memory pricing for their offerings as a means of differentiation and developer recruitment, the answer was no. According to Google, any specific or distinct improvements on a per category basis – memory, compute, etc – are arbitrary, as the company seeks to lower the overall cost of their offering based on improved efficiencies, economies of scale and so on rather than deliberately targeting areas developers might prioritize in their own application development process.

Whatever their origin, however, developers looking to maximize their memory footprint per dollar spent may be interested in the above as a guide towards separating services from one another.

In terms of computing units per dollar, Google has made progress since the last iteration of this analysis, where it was a bottom third performer. Today, the company enjoys a narrow lead over Amazon, followed closely by HP and Digital Ocean. IBM, Joyent and Microsoft, meanwhile, round out the offerings here.

It is interesting to note the wider distribution within computing units versus memory, as one example. Where there is comparatively minimal separation between providers with regard to memory per dollar, there are relatively substantive deltas between providers in terms of computing power per package. It isn’t clear that this has any material impact on selection or buying preferences at present, but for compute intensive workloads in particular it is at least worth investigating.

IaaS Price History and Implications

Besides taking apart the base infrastructure pricing on a component basis, one common area of inquiry is how provider prices have changed over time. It is enormously difficult to capture changes across services on a comparative basis over time, for many of the reasons mentioned above.

That being said, as many have inquired on the subject, below is a rough depiction of the pricing trends on a provider by provider basis. In addition to the caveats at the top of this piece, it is necessary to note that the below chart attempts to track only services that have been offered from the initial snapshot moving forward so as to be as consistent as possible. Larger instances recently introduced are not included, therefore, and other recent additions such as Amazon’s m3 SSD-backed package are likewise omitted.

Just as importantly, services cannot be reasonably compared to one another here because their available packages and the attached pricing vary widely; some services included more performant, higher cost offerings initially, and others did not. Comparing the average prices of one to another, therefore, is a futile exercise.

The point of the following chart is instead to try and understand price changes on a per provider basis over time. Nothing more, and nothing less.

Unsurprisingly, the overall trajectory for nearly all providers is down. And the exception – Microsoft – appears to spike only because its base offerings today are far more robust than their historical equivalents. The average price drop for the base level services included in this survey from the initial 2012 snapshot to today was 95%: what might have cost $0.35 – $0.70 an hour in 2012 is more likely to cost $0.10 – $0.30 today. Which raises many questions, the most common of which is to what degree the above general trend is sustainable: is this a race to the bottom, or are we nearing a pricing floor?

While we are far from having a definitive answer on the subject, early signs point to the latter. In the week preceding Amazon’s re:Invent, Google announced across the board price cuts to varying services, on top of an October 10% price cut. A week later, the fact that Amazon did not feel compelled to respond was the subject of much conversation.

One interpretation of this lack of urgency is that it’s simply a function of Amazon’s dominant role in the market. And to be sure, Amazon is in its own class from an adoption standpoint. The company’s frantic pace of releases, however – 280 in 2013, on pace for 500 this year – suggests a longer term play. The above charts describe pricing trends in one of the most basic elements of cloud infrastructure: compute. They suggest that at present, Amazon is content to be competitive – but is not intent on being the lowest cost supplier.

By keeping pricing low enough to prevent it from being a real impediment to adoption, while growing its service portfolio at a rapid pace, Amazon is able to get customers in the door with minimal friction and upsell them on services that are both much less price sensitive than base infrastructure as well as being stickier. In other words, instead of a race to the bottom, the points of price differentiation articulated by the above charts may be less relevant over time, as costs approach true commodity levels – a de facto floor – and customer attention begins to turn to time savings (higher end services) over capital savings (low prices) as a means of cost reduction.

If this hypothesis is correct, Amazon’s price per category should fall back towards the middle ground over time. If Amazon keeps pace, however, it may very well be a race to the bottom. Either way, it should show up in the charts here.

Disclosure: Amazon, HP, IBM, Microsoft and Rackspace are RedMonk customers. Digital Ocean, Google and Joyent are not.

Link: Here is a link to the dataset used in the above analysis.

What are the Most Popular Open Source Licenses Today?

For a variety of reasons, not least of which is that fewer people seem to care anymore, it’s been some time since we looked at the popularity of open source licenses. Once one of the more common inquiries we fielded, questions about the relative merits or distribution of licenses have faded as we see both consolidation around choices and increased understanding of the practical implications of various licensing styles. Given the recent affinity for permissive licensing, however, amongst major open source projects such as Cloud Foundry, Docker, Hadoop, Node.js or OpenStack, it’s worth revisiting the question of license choices.

Before we get into the question of how licensing choices have changed, it’s necessary to establish a baseline number for distribution today. While it cannot be considered definitive, Black Duck’s visibility into a wide variety of open source repositories and forges serves as a useful sample. Based on the Black Duck data, then, the following chart depicts the distribution of usage amongst the ten most popular open source licenses.

Moving left to right, from less popular licenses to the most popular, it is easy to determine the overall winner. As has historically been the case, the free software, copyleft GPLv2 is the most popular license choice according to Black Duck. Besides high profile projects such as Linux or MySQL, the GPL has been the overwhelmingly most selected license for years. The last time we examined the Black Duck data in 2012, in fact, the GPL was more popular than the MIT, Artistic, BSD, Apache, MPL and EPL put together.

Popular as the GPL remains, however, it no longer enjoys that kind of advantage. If we group both versions (2 and 3) of the GPL together, the GPL is in use within 37% of the Black Duck surveyed projects. The three primary permissive license choices (Apache/BSD/MIT), on the other hand, collectively are employed by 42%. They represent, in fact, three of the five most popular licenses in use today.

License selection has clearly changed, then, but by how much? For comparison’s sake, here’s a chart of the percent change in license usage from this month’s snapshot of Black Duck’s data versus one from 2009.

As we can see, the biggest loser in terms of share was the GPLv2 and, to a lesser extent, the LGPLv2.1. The decline in usage of the GPLv2 can to some degree be attributed to copyleft license fans choosing instead the updated GPLv3; that license, released in 2007, gained about 6% share from 2009 to 2014. But with usage of version 2 down by about 24%, the update is clearly not the only reason for decreased usage of the GPL.

Instead, the largest single contributing factor to the decline of the GPL’s dominance – it’s worth reiterating, however, that it remains the most popular surveyed license – is the rise of permissive licenses. The two biggest gainers on the above chart, the Apache and MIT licenses, were collectively up 27%. With the BSD license up 1%, the three most popular permissive licenses are collectively up nearly 30% in the aggregate.

This shift will surprise some, and suggests that much like the high profile of projects like Linux and MySQL led to wider adoption of reciprocal or copyleft-style licenses, Hadoop and others are leaving a sea of permissively licensed projects in their wake.

But the truth is that a correction of some sort was likely inevitable. The heavily skewed distribution towards copyleft licenses was always somewhat unnatural, and therefore less than sustainable over time. What will be interesting to observe moving forward is whether these trends continue, or whether further corrections are in store. Currently, license preferences seem to be accumulating at either end of the licensing spectrum (reciprocal or permissive); the middle ground of file-based licenses such as the LGPL/MPL remains a relatively distant third category in popularity. Will MPL-licensed projects like the recently opened Manta or SmartDataCenter change that, or are they outliers?

Whatever the outcome, it’s clear we should expect greater diversity amongst licensing choices than we’ve seen in the past. The days of having a single dominant license are, for all practical purposes, over.

Disclosure: Black Duck, the source of this data, has been a RedMonk client but is not currently.

Model vs Execution

One of the things that we forget today about SaaS is that we tried it before, and it failed. Coined sometime in 1999 if Google is to be believed, the term “Application Service Provider” (ASP) was applied to companies that delivered software and services over a network connection – what we today commonly call SaaS. By and large this market failed to gain significant traction. Accounts differ as to how and when a) SaaS was coined (IT Toolbox claims it was coined in 2005 by John Koenig) and b) replaced ASP as the term of choice, but the fact that ASP could be replaced at all is an indication of its lack of success. While various web based businesses from that period are not only still with us but, in Amazon and Google, among the largest in the world, those attempting to sell software via the web rather than deploying it on premise generally did not survive.

A decade plus later, however, and not only has the SaaS model itself survived, but it is increasingly the default approach. The point here isn’t to examine the mechanics of the SaaS business, however; we’ve done that previously (see here or here, for example). The point of bringing up SaaS here, rather, is to serve as a reminder that there’s a difference between model and execution.

Too often in this industry, we look upon a market failure as a permanent indictment of potential. If it didn’t work once, it will never work.

The list of technologies that have been dismissed because they initially failed or seemed unimpressive is long: virtualization was widely regarded as a toy; it’s now an enterprise standard. Smart people once looked at containers and said “neato, why would you want to do that?” Two plus years after Amazon’s creation of the cloud market, then Microsoft CTO Ray Ozzie admitted that cloud “isn’t being taken seriously right now by anybody except Amazon.” In the wake of the anemic adoption – particularly relative to Amazon’s IaaS alternative – of the first iterations of PaaS market pioneers Force.com and Google App Engine, many venture capitalists decided that PaaS was a model with no future. DVCS tools like Git were initially scorned and reviled by developers because they were different on a fundamental level.

In each case, it’s important to separate the model from the execution. Too often, failures of the latter are perceived as a fatal flaw in the former. In the case of PaaS, for example, it’s become obvious that the lack of developer adoption was driven by the initial constraints of the first platforms; not having to worry about scaling was an attractive feature, but not worth the sacrifice of having to develop an application in a proprietary language against a proprietary backend that ensured the application could never be easily ported. Half a dozen years later, PaaS platforms are now not only commonly multi-runtime but open source, and growth is strong.

SaaS, meanwhile, would prove to be an excellent model over time, but initially had to contend with headwinds consisting of inconsistent and asymmetrically available broadband, far more functionally limited browser technologies and a user population both averse to risk and brought up on the literal opposite model. In retrospect, it’s no surprise that the ASP market failed; indeed, it’s almost more surprising that the SaaS market followed so quickly on its heels.

In both cases, the initial failures were not attributable to the models. There is in fact demand for PaaS and SaaS; it was simply that the vendors did not (PaaS) or could not (SaaS) execute properly at the time.

Given the rate and pace of change in the technology industry, it is both necessary and inevitable that new technology and approaches are viewed skeptically. As with most innovation, in the technology world or outside of it, failure is the norm. But critical views notwithstanding, it’s important to try and understand the wider context when evaluating the relative merits of competing models. It may well be that the model itself is simply unworkable. But in an industry where what is old is new again, daily, it is far more likely that a current lack of success is due to a failure to execute, or to an inability to do so because of market factors.

In which case, you may want to give that “failed” market a second look. Opportunity may lie within.

Categories: Business Models.

A Few Suggestions for Briefing Analysts

One of the things that happens when you’re a developer focused analyst firm these days is that you talk to a lot of companies. The conversations analysts have with commercial vendors or developers about their projects are called briefings.

Whether the company or project is large or small, old or new, there are always ways to use our collective time – meaning the analyst’s and the company/developer’s – more efficiently and effectively. Having been doing this analyst thing for a little while now, I have a few ideas on what some of those ways might be and thought I’d share them. For anyone briefing an analyst then, I offer the following hopefully helpful suggestions. Best case they’ll make better use of your time, worst case you make the analyst’s life marginally easier, which probably can’t hurt.

  1. Determine how much time you have up front
    This will tend to vary by analyst firm, and sometimes by analyst. At RedMonk, for example, we limit briefings with non-clients to a half hour, a) because we have to talk to a lot of people and b) because very few people have a problem getting us up to speed in that time. It’s important, however, to be aware of this up front. If you think you have an hour, but only have half that, you might present the materials differently.
  2. Unless you’re solving a unique problem, don’t spend your time covering the problem
    If the analyst you’re speaking with is capable, they already understand it well, so time describing it is effectively wasted time. If there’s some aspect of a given market that you perceive differently and break with the conventional wisdom, by all means explain your unique vision of the world (and expect pushback). But a lot of presentations, possibly because they originated as material for non-analysts, spend time describing a market that everyone on the call likely already understands. Jumping right to how you are different, then, is more productive.
  3. If you’re just delivering slides and they’re not confidential (see #4), do not use web meeting software
    If you need to demo an application, web meeting software is acceptable. If you’re just going over slides that aren’t confidential, skip it. Inevitably the meeting software won’t work for someone; they don’t have the right plugin, a dependency is missing, their connection is poor, etc. The downtime while everyone else is waiting for the one meeting participant to download yet another version of web meeting software they probably don’t want is time that everyone else loses and can never get back. Also, it’s nice for analysts to have slides to refer to later.
  4. Don’t have confidential slides
    If you’re actively engaging with an analyst in something material, a potential acquisition for example, confidential slides are pretty much unavoidable. But if you’re doing a simple product briefing, lose the confidential slides. It makes it more difficult to recall later – particularly if a copy of the slides is not made available – what precisely is confidential, and what is not. Which means that analysts may be reluctant to discuss news or information you’d like them to, due to the cognitive overhead of having to remember which 5 slides out of 40 were confidential. When it comes time to present confidential material, just note that and walk through it verbally.
  5. If you spend the entire time talking, you may miss out on the opportunity for questions later
    It’s natural to want to talk about your product, and the best briefings are conducted by people with good energy and enthusiasm for what they do. That being said, making sure you leave time for questions can gain you valuable insights into what part of your presentation isn’t clear, and – depending on the analyst/firm – may lead to a two way conversation where you can get some of your own questions answered.
  6. Don’t use the phrase “market leader,” let the market tell us that
    This is perhaps just a pet peeve of mine, but my eyebrows definitely go up when vendors claim to be the “market leader.” This is for a few reasons. First, because genuine market leaders should not have to remind you of that. Second, what is the metric? Analysts may not agree with your particular yardstick. Third, because your rankings may not reflect an analyst’s view of the market, and while disagreement is normal it can sidetrack more productive conversations.
  7. Analysts aren’t press, so treating them that way is a mistake
    While frequently categorized together, analysts and press are in reality very different. Attitudes towards and incentives regarding embargoes, for example, are entirely distinct. Likewise, many vendors and their PR teams send out “story ideas” to analysts, which is pointless because analysts don’t produce “stories” and are rarely on deadline in the way that the press is. What we tell clients all the time is that our job is not to break news or produce “scoops,” it’s to understand the market. If you treat analysts as press that is trying to extract information from you for that purpose, you may miss the opportunity to have a deeper, occasionally confidential, dialogue with an analyst.
  8. Make sure the analyst covers your space; if you don’t know, just ask
    Every analyst, whether generalist or specialist, will have some area of focus. Before you spend your time and theirs describing your product or service, it’s important to determine whether or not they cover your space at all. Every so often, for example, vendor outreach professionals will see that we cover “developers” and try to schedule a briefing for their bodyshop offering developmental services. Given that we don’t generally cover professional services, this isn’t a good use of anybody’s time. The simplest way of determining whether they cover your category, of course – assuming you can’t determine this from their website, Twitter bio, prior coverage, etc – is to just ask.
  9. Asking for feedback “after the call”
    In general, it seems like a harmless request to make at the end of a productive call: “If you think of any other feedback for us after the call, feel free to send this along.” And in most cases, it is relatively innocuous. Another way of interpreting this request, however, is: “Feel free to spend cycles thinking about us and send along free feedback after we’re done.” So you might consider using this request sparingly.
  10. Don’t ask if we want to receive information: that’s just another email thread
    There are very few people today who don’t already receive more email than they want or can handle. To make everyone’s lives simpler, then, it’s best to skip emails that take the form “Hi – We have an important announcement. Would you like to receive our press release concerning this announcement? If so send us an email indicating that you’ll respect the embargo.” As most analysts will respect embargoes – because we’re not press (see #7) – asking an analyst to reply to an email to get yet another email in return is a waste of an email thread. Your best bet is to maintain a list of trusted contacts, and simply distribute the material to them directly.

Those are just a few that occur off the top of my head based on our day to day work. Do these make sense? Are there other questions, or suggestions from folks in the industry?

Categories: Industry Analysis.

The 2014 Monktoberfest

Last Thursday at ten in the morning, this auditorium was full because I made a joke four years ago.

Describing the Monktoberfest to someone who has never been is difficult. Should we focus on the content, where we prioritize talks about social and tech that don’t have a home at other shows but make you think? Or the logistics, where we try to build a conference that loses the things we don’t enjoy from other conferences? Or maybe the most important thing is the hallway track, which is another way of saying the people?

Whatever else it may be, the Monktoberfest is different. It’s different talks, in a different city, given and discussed by different people. Some of those people are developers with a year or two of experience. Others are founders and CEOs. People helping to decide the future of the internet. Those in business school to help build the businesses that will be run on top of it. Startups meeting with incumbents, cats and dogs living together.

Which is, hopefully, what makes it as fun as it is professionally useful. It doesn’t hurt, of course, that the conference’s “second track” – my thanks to Alex King for the analogy – is craft beer.

During the day our attendees are asked to wrap their minds around complicated, nuanced and occasionally controversial issues. What are the social implications and ethics of running services at scale? When you cut through the hype, what does IoT mean for our lives and the way we play? Perhaps most importantly, how is our industry actually performing with respect to gender issues and diversity? And what can we, or what must we, do to improve that?

To assist with these deliberations, and to simultaneously expand horizons on what craft beer means, we turn loose two of the best beer people in the world, Leigh and Ryan Travers, who run Stillwater Artisanal Ales’ flagship gastropub, Of Love and Regret, down in Baltimore. Whether we’re serving them the Double IPA that Beergraphs ranks as the best beer in the world, canned fresh three days before, or a 2010 Italian sour that was one of 60 bottles ever produced, we’re trying to deliver a fundamentally different and unique experience.

As always, we are not the ones to judge whether we succeeded in that endeavor, but the reactions were both humbling and gratifying.

Out of all of those reactions, however, it is ones like this that really get to us.

The fact that many of you will spend your vacation time and your own money to be with us for the Monktoberfest is, quite frankly, incredible. But it just speaks to the commitment that attendees have to make the event what it is. How many conference organizers, for example, are inundated with offers of help – even if it’s moving boxes – ahead of the show? How many are complimented by the catering staff, every year, that our group is one of the nicest and most friendly they have ever dealt with? How many have attendees that moved other scheduled events specifically so that they could attend the Monktoberfest?

This is our reality.

And as we say over and over, it is what makes all the blood, sweat and tears – and as any event organizer knows, there are always a lot of all three – worth it.

The Credit

Those of you who were at dinner will have heard me say this already, but the majority of the credit for the Monktoberfest belongs elsewhere. My sincere thanks and appreciation to the following parties.

  • Our Sponsors: Without them, there is no Monktoberfest
    • IBM: Once again, IBM stepped up to be the lead sponsor for the Monktoberfest. While it has been over a hundred years since the company was a startup, it has seen the value of what we have collectively created in the Monktoberfest and provided the financial support necessary to make the show happen.
    • Red Hat: As the world’s largest pure play open source company, there are few who appreciate the power of the developer better than Red Hat. Their support as an Abbot Sponsor – the fourth year in a row they’ve sponsored the conference, if I’m not mistaken – helps us make the show possible.
    • Metacloud: Though it is now part of Cisco, Metacloud stood alongside of Red Hat to be an Abbot sponsor and gave us the ability to pull out all the stops – as we are wont to do.
    • EMC: When we post the session videos online in a few weeks, it is EMC that you will have to thank.
    • Mandrill: Did you enjoy the Damariscotta river oysters, the sushi from Miyake, the falafel and sliders bar, or the mac and cheese station? Take a minute to thank the good folks from Mandrill.
    • Atlassian: Whenever you’re enjoying your shiny new Hydro-Flask 40 oz growler – whether it’s filled with a cold beverage or hot cocoa – give a nod to Atlassian, who helped make them possible. Outside certainly approves of the choice.
    • Apprenda / HP: From the burrito spread to the Oxbow-infused black bean soup, Apprenda and HP are responsible for your lunch.
    • WePay: Like your fine new Teku stemmed tulip glassware? Thank WePay.
    • AWS/BigPanda/CohesiveFT/HP: Maybe you liked the ginger cider, maybe it was the exceedingly rare Italian sour, or maybe still it was the Swiss stout? These are the people that brought it to you.
    • Cashstar: Liked the Union Bagels on Thursday or the breakfast burritos? That was Cashstar’s doing.
    • O’Reilly: Lastly, we’d like to thank the good folks from O’Reilly for being our media partner yet again and bringing you free books.
  • Our Speakers: Every year I have run the Monktoberfest I have been blown away by the quality of our speakers, a reflection of their abilities and the effort they put into crafting their talks. At some point you’d think I’d learn to expect it, but in the meantime I cannot thank them enough. Next to the people, the talks are the single most defining characteristic of the conference, and the quality of the people who are willing to travel to this show and speak for us is humbling.
  • Ryan and Leigh: Those of you who have been to the Monktoberfest previously have likely come to know Ryan and Leigh, but for everyone else they really are one of the best craft beer teams not just in this country, but the world. And they’re even better people, having spent the better part of the last few months sourcing exceptionally hard to find beers for us. It is an honor to have them at the event, and we appreciate that they take time off from running the fantastic Of Love & Regret to be with us.
  • Lurie Palino: Lurie and her catering crew have done an amazing job for us every year, but this year was the most challenging yet due to some late breaking changes in the weeks before the event. As she does every year, however, she was able to roll with the punches and deliver on an amazing event yet again. With no small assist from her husband, who caught the lobsters, and her incredibly hard working crew at Seacoast Catering.
  • Kate (AKA My Wife): Besides spending virtually all of her non-existent free time over the past few months coordinating caterers, venues and overseeing all of the conference logistics, Kate was responsible for all of the good ideas you’ve enjoyed, whether it was the masseuses two years ago, the cruise last year or the inspired choice of venue this. And she gave an amazing talk on the facts and data behind sexual harassment. I cannot thank her enough.
  • The Staff: Juliane did yeoman’s work organizing many aspects of the conference, including the cruise, and with James secured and managed our sponsors. Marcia handled all of the back end logistics as she does so well – and put up with the enormous growler boxes living at her house for a week. Kim not only worked both days of the show, but traveled down to Baltimore and back by car simply to get things that we couldn’t get anywhere else. Celeste, Cameron, Rachel, Gretchen, Sheila and the rest of the team handled the chaos that is the event itself with ease. We’ve got an incredible team that worked exceptionally hard.
  • Our Brewers: We picked a tough week for brewer appearances this year, as we overlapped with no fewer than three major beer festivals, but The Alchemist was fantastic as always about making sure that our attendees got some of the sweet nectar that is Heady Topper, and Mike Guarracino of Allagash was a huge hit attending both our opening cruise and Thursday dinner. Oxbow Brewing, meanwhile, not only connected us with a few hard to get selections, but loaned us some of the equipment we needed to have everything on tap. Thanks to all involved.
  • Erik Dasque: As anyone who attended dinner is aware, Erik was our drone pilot for the evening. He was gracious enough to get his Phantom up into the air to capture aerial shots of the Audubon facility as well as video of our arriving attendees. Wait till you see his video. In the meantime, here’s a picture.

With that, this year’s Monktoberfest is a wrap. On behalf of myself, everyone who worked on the event, and RedMonk, I thank you for being a part of what we hope is a unique event on your schedule. We’ll get the video up as quickly as we can so you can share your favorite talks elsewhere.

For everyone who was with us, I owe you my sincere thanks. You are why we do this, and you are the Monktoberfest. Stay tuned for details about next year, as we’ve got some special things planned for our 5th anniversary, and in the meantime you might be interested in Thingmonk or the Monki Gras, RedMonk’s other two conferences, as well as the upcoming IoT at Scale conference we’re running with SAP in a few weeks.

Categories: Conferences & Shows.

A Swing of the Pendulum: Are Fragmentation’s Days Numbered?

Foucault's Pendulum

One of the lessons that has stayed with me all these years removed from my History major is the pendulum theory. In short, it asserts that history typically moves within a pendulum’s arc: first swinging in one direction, then returning towards the other. I’ve been thinking about this quite a bit in recent months as the predictable result of widespread developer empowerment becomes more and more visible in virtually all of the metrics we track. Unsurprisingly, when you have two populations making decisions, the larger one leads to a wider array of outcomes. CIOs, as an example, were long content to consolidate on a limited number of runtimes – Java, .NET and a few others. All of the data we see, however, suggests that as the New Kingmakers have begun to rise up and act on their own initiative, the distribution of runtimes employed has exploded. The pendulum, quite obviously, had swung from centralized to fragmented, driven by a fundamental shift in the way that technologies were selected.
The question I’ve been pondering is simple: when does it begin to swing back in the other direction?

If there is any reversal here, it will come from developers. Even the large, CIO-centric incumbents are aware today that developers are in charge, so there’s no evidence to suggest that CIOs have a plausible strategy for putting developers back under thumb. But while over the last few years newly empowered developers have shown an insatiable appetite for new technologies, it hasn’t been clear that this trajectory was sustainable longer term.

Which is why I’ve been paying attention, looking for evidence that the pendulum swing might be slowing – even reversing. The data is inconclusive. As Donnie has noted, there have only been five languages that really mattered on a volume basis on GitHub: JavaScript, Ruby, Java, PHP, and Python. And yet our rankings indicate that while they do indeed represent the fat part of the tail, there is substantial, ongoing volume usage of maybe twenty to thirty on top of that.

What the data won’t say, however, developers themselves will. Witness this piece from Tim Bray:

There is a real cost to this continuous widening of the base of knowledge a developer has to have to remain relevant. One of today’s buzzwords is “full-stack developer”. Which sounds good, but there’s a little guy in the back of my mind screaming “You mean I have to know Gradle internals and ListView failure modes and NSManagedObject quirks and Ember containers and the Actor model and what interface{} means in Go and Docker support variation in Cloud providers?” Color me suspicious.

Which links to this piece by Ed Finkler:

My tolerance for learning curves grows smaller every day. New technologies, once exciting for the sake of newness, now seem like hassles. I’m less and less tolerant of hokey marketing filled with superlatives. I value stability and clarity.

Which elicited this response from Marco Arment:

I feel the same way, and it’s one of the reasons I’ve lost almost all interest in being a web developer. The client-side app world is much more stable, favoring deep knowledge of infrequent changes over the constant barrage of new, not necessarily better but at least different technologies, libraries, frameworks, techniques, and methodologies that burden professional web development.

Which in turn prompted a response from Matt Gemmell entitled “Confessions of an Ex-Developer”:

I’m glad there are no compilers (visible) in my life. I’m also glad that I can view the WWDC keynote as a tourist, without any approaching tension headache as I think about what I’ll need to add, or change, or remove. I can drift languidly along on the slow-moving current of the everyday web, indulging an old habit when a rainy evening comes by.

It’s a profoundly relaxing thing to be able to observe the technology industry without being invested in it. I’m glad I’m not making software anymore.

To be clear, these are merely four developers. Four experienced developers, more importantly. It may very well be that their experiences are nothing more than a natural and understandable change in priorities that comes with age.

But their experience seems to mirror a logical reaction to a very rapid set of transformations in this industry. Given the hypothesis that the furious rate of change and creation in technology will at some point hit a point of diminishing returns, then become actively counterproductive, it follows that these could merely be the bleeding edge of a more active backlash against complexity. Developers have historically had an insatiable appetite for new technology, but it could be that we’re approaching the too-much-of-a-good-thing stage. In which case, the logical outcome will be a gradual slowing of fragmentation followed by gradual consolidation. Market outcomes would be dependent on individual differences between rates of change, the negative impacts of fragmentation and so on.

It may be difficult to conceive of a return to a more simple environment, but remember that the Cambrian explosion the current rate of innovation is often compared to was itself very brief – in geologic terms, at least. Unnatural rates of change are by definition unnatural, and therefore difficult to sustain over time. It is doubtful that we’ll ever see a return to the radically more simple environment created by the early software giants, but it’s likely that we’ll see dramatically fewer popular options per category.

Whether we’re reaching the apex of the swing towards fragmentation is debatable; less so is the fact that the pendulum will swing the other way eventually. It’s not a matter of if, but when.

Categories: Programming Languages.

What is the Atomic Unit of Computing?

defining the unit of atomic weight

According to published reports, Docker (née dotCloud) is in the process of securing $40M in financing. Update: Originally misstated the amount of financing, but the substance of the post stands.

If popularity is a guiding metric, this infusion will come as no surprise. Docker is one of the fastest growing projects we have ever seen at RedMonk, and virtually no one we speak with is surprised to hear that. In a little over a year, Docker has exploded into a technology that is seeing near universal uptake, from traditional enterprise IT suppliers (e.g. Red Hat) to emerging infrastructure players (e.g. Google).

There are many questions currently being asked about Docker. Most obviously, why now? The idea of containers is not new, and conceptually can be dated back to the mainframe, with more recent implementations ranging from FreeBSD Jails to Solaris Zones. What is it about Docker that has captured mainstream interest where previous container technologies were unable to?

Rather than one explanation, it is likely a combination of factors. Most obviously, there is the popularity of the underlying platform. Linux is exponentially more popular today than any of the other platforms offering containers have been. Containers are an important, perhaps transformative feature. But they historically haven’t been enough to compel a switch from one operating system to another.

Perhaps more importantly, however, there are two larger industry shifts at work which ease the adoption of container technologies. First, there is the near ubiquity of virtualization within the enterprise. When Solaris Zones dropped in 2004, for example, VMware was six years old, five months from being bought by EMC (in a move that baffled the industry) and three years away from an IPO. Ten years later, and virtualization is, quite literally, everywhere. At OSCON, for example, one database expert noted that somewhere between 30% and 50% of his very large database workloads were running virtualized. The last workload to be virtualized, in other words, is now virtualized almost half the time. Just as the ASP market failure paved the way for the later SAAS market entrants, the long fight for virtualization acceptance has likely eased the adoption of container technologies like Docker.

More specific to containers, however, is the steady erosion in the importance of the operating system. To be sure, packaged applications and many infrastructure components are still heavily dependent on operating system-specific certifications and support packages. But it’s difficult to make the case that the operating system is as all powerful as it was, given the complete reversal of attitudes towards Ubuntu in the cloud era. Prior to the ascension of Amazon and other public cloud suppliers, large scale enterprise support on a general basis was near zero. Today, besides being far and away the most popular distribution on Amazon, Ubuntu is supported by those same enterprise stalwarts from HP to IBM. Nor has IAAS been the only factor in the ongoing disintermediation of the operating system; as discussed previously, PAAS is the new middleware, and middleware’s explicit mission has historically been to abstract the application from the operating system underneath it.

These developments imply that there is a shift at work in the overall market importance of the operating system (a shift that we have been expecting since 2010), which in turn helps explain how containers have become so popular so quickly. Unlike virtual machines, which replicate an entire operating system, containers act like a diff of two different images. Operating system components common to the two images are shared, leaving the container to house just the difference: little more than the application and any specific dependent libraries, etc. Which means that containers are substantially lighter weight than full VMs. If applications are heavily operating system dependent and you run a mix of operating systems, containers will be problematic. If the operating system is a less important question, however, containers are a means of achieving much higher application density on a given instance versus virtual machines fully emulating an operating system.
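
To make the "diff" idea concrete, here is a minimal sketch in Python. It is a toy model only, not a description of how Docker actually stores images: the component names, the sizes and the `footprint_mb` helper are all hypothetical, chosen to show why housing just the difference allows higher density.

```python
# Toy model: each image is a mapping of component name -> size in MB.
# All names and sizes below are invented for illustration.
BASE_OS = {"os-userland": 400, "libc": 30, "shell-utils": 50}

APP_A = {**BASE_OS, "app-a": 20, "libfoo": 5}
APP_B = {**BASE_OS, "app-b": 35}


def diff(image, base):
    """Return only the components of `image` that are not part of `base`."""
    return {name: size for name, size in image.items() if name not in base}


def footprint_mb(images, base, shared):
    """Total storage needed for `images`.

    shared=True  -> container-style: the base is stored once, plus each diff.
    shared=False -> VM-style: every image carries its own full copy.
    """
    if shared:
        return sum(base.values()) + sum(
            sum(diff(image, base).values()) for image in images
        )
    return sum(sum(image.values()) for image in images)


images = [APP_A, APP_B]
vm_style = footprint_mb(images, BASE_OS, shared=False)
container_style = footprint_mb(images, BASE_OS, shared=True)

print(diff(APP_A, BASE_OS))       # just the application and its library
print(vm_style, container_style)  # full copies vs. shared base plus diffs
```

With these made-up numbers the shared layout needs roughly half the space of two full copies, and the gap widens with every additional application placed on the same base.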

Taken in the aggregate, this is at least a partial explanation for the question of “why now?” As is typical with dramatic movements, Docker’s success is as much about context as the quality of the underlying technology – intending no disrespect to the Docker engineers, of course. Engineering is critical, it’s just that timing is usually more critical.

The most important question about Docker, however, isn’t “why now?” It is rather the one being asked more rarely today, by those struggling to understand where the often overlapping puzzle pieces fit. The explosion of Docker’s popularity begs a more fundamental question: what is the atomic unit of infrastructure moving forward? At one point in time, this was a server: applications were conceived of, and deployed to, a given physical machine. More recently, the base element of an infrastructure was a virtual recreation of that physical machine. Whether you defined that as Amazon did or VMware might was less important than the idea that an image resembling a server, from virtualized hardware and networking interfaces to a full instance of an operating system, was the base unit from which everything else was composed.

Containers generally and Docker specifically challenge that notion, treating the operating system and everything beneath as a shared substrate, a universal foundation that’s not much more interesting than the raised floor of a datacenter. For containers, the base unit of construction is the application. That’s the only real unique element.

What this means is as yet undetermined. Users are for the most part years away from understanding this division, let alone digesting its implications. But vendors and projects alike should be, and in some cases are, beginning to critically evaluate the lens through which they view the world. Infrastructure players like VMware and the OpenStack ecosystem, for example, need to project forward the potential opportunities and threats presented by an application-centric as opposed to VM-centric worldview, while Docker and others in similar orbits (e.g. Cloud Foundry) conversely need to consider how to traverse the comprehension gap between what users expect and what they get.

Google App Engine, Force.com and others, remember, tried to sublimate the underlying infrastructure in the first generation of PAAS offerings and the result was a market dwarfed by IAAS – which not coincidentally looked a lot more like the physical infrastructure customers were used to. But as the Turkey Fallacy states, “it hasn’t happened so it won’t happen” is not the most sustainable defense imaginable. Just because PAAS struggled to get customers beyond thinking in terms of physical hardware doesn’t mean that Docker will as well.

In any event, expect to see players on both sides of the VM / app divide aggressively jockeying for position, as no one wants to be the one left without a chair when the music stops.

Categories: Containers, Open Source, Virtualization.

SaaS vs The Perpetual License Model

Founded in 1998, VMware went public on August 14, 2007. Founded a year after VMware in 1999, Salesforce.com held its initial public offering three years earlier on June 23, 2004. From a temporal standpoint, then, the two companies are peers. Which is one reason that it’s interesting to examine how they have performed to date, and how they might perform moving forward.

Another is the fact that they represent, at least for now, radically different approaches to the market. VMware, of course, is the dominant player in virtualization and markets adjacent to that space. Where Microsoft built its enormous value in part by selling licenses to the operating system upon which workloads of all shapes and sizes were run, VMware’s revenue has been driven primarily by the sales of software that makes operating systems like Windows virtual.

Salesforce.com, on the other hand, has been since its inception the canonical example for what eventually came to be known as Software-as-a-Service, the sale of software centrally hosted and consumed via a browser. Not only is Salesforce.com’s software – or service, as they might prefer it be referred to – consumed very differently than software from VMware, the licensing and revenue model of services-based offerings is quite distinct from the traditional perpetual license business.

It has been argued in this space many times previously (see here, for example) that the perpetual license model that has dominated the industry for decades is increasingly under pressure from a number of challengers ranging from open source competition to software sold, as in the case of Salesforce.com, as a service. In spite of the economic challenge inherent in running a business whose revenue is realized over a long period of time while competing with businesses that collect all of their money up front, the bet here is that the up-front model will increasingly give way to the services-based one.

Which is why it’s interesting, and perhaps informative, to examine the relative performance, and market valuation, of two peers that have taken very different approaches. What does the market believe about the two models, and what does their performance tell us if anything?

It is necessary to acknowledge before continuing that in addition to the difference in terms of model, Salesforce and VMware are not peers from a categorical standpoint. The former is primarily an applications vendor, its investments in platforms such as Force.com or Heroku notwithstanding, while the latter is an infrastructure player, in spite of its dalliances with application plays such as Zimbra. While the comparison is hardly apples to apples, therefore, it is nevertheless instructive in that both vendors compete for a percentage of organizational IT spend. But keep that distinction in mind regardless.

As for the most basic metric, market capitalization, VMware is currently worth more than Salesforce: $42 billion to $35 billion as I write this. Even the most cursory examination of some basic financial metrics will explain why. Since it began reporting in 2003, Salesforce.com has on a net basis lost just over $350 million. VMware, for its part, has generated a net profit of almost $4 billion.

If we examine the quarterly gross profit margins of the two entities over time, these numbers make sense.

Apart from some initially comparable margins for Salesforce, VMware has consistently commanded a higher margin than its SaaS-based counterpart. While this chart likely won’t surprise anyone who’s tracked the markets broadly or the companies specifically, however, it’s interesting to note the fork in the 2010-2011 timeframe. Through 2010, the margins appeared to be on a path to converge; after, they diverged aggressively. Nor is this the only metric in which this pattern is visible.
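For readers who want to reproduce this kind of comparison from public filings, gross margin is simply the share of revenue left after the direct cost of delivering it. The sketch below uses made-up figures, not either company's reported numbers, and the function name is my own.

```python
def gross_margin(revenue, cost_of_revenue):
    """Gross margin as a fraction of revenue: (revenue - cost) / revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost_of_revenue) / revenue


# Hypothetical quarterly figures in millions of dollars (illustrative only).
license_vendor = gross_margin(1000.0, 150.0)
saas_vendor = gross_margin(1000.0, 240.0)

print(f"{license_vendor:.1%}")  # 85.0%
print(f"{saas_vendor:.1%}")     # 76.0%
```

A services vendor carries hosting and operations in its cost of revenue, which is one reason its margin would be expected to sit below that of a license vendor at similar revenue.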

We see the same trajectories in a plot of the quarterly net income, for example, but this is to be expected since this is in part a function of the profit generated.

The question is what changed in the 2010 timeframe, and one answer appears to be investments in scale. The following chart depicts the gross property, plant and equipment charges for both firms over the same timeframe.

Notice in particular the sharp spike beginning in 2010-2011 for Salesforce. After six years as a public entity, the company began investing in earnest, which undoubtedly had consequences partially reflected in the charts above. VMware’s expenditures here are interesting for their part, because the conventional wisdom is that it is the services firms like Salesforce, Amazon, Google or increasingly Microsoft whose PP&E will reflect their necessary investments in the infrastructure to deliver services at scale. Granted, VMware’s vCloud Hybrid Service is intended to serve as a public infrastructure offering, but this was intended to be an “asset-light model” which presumably would not have commanded the same infrastructure investments. Nevertheless, VMware outpaced Salesforce until 2014.

The question, whether it’s from the perspective of an analyst or an investor, is what the returns have been on this dramatic increase in spending. Obviously in Salesforce’s case, its capital investments have dragged down its income and margins, while VMware’s dominant market position has allowed it to not only sustain its pricing but grow its profitability. But what about revenue growth?

One of the strongest arguments in favor of SaaS products is convenience; it’s far less complicated to sign up to a service hosted externally than it is to build and host your own. If convenience is indeed a driver for adoption, greater revenue growth is one potential outcome: if it’s easier to buy, it will be bought more. This is, in fact, a necessity for Salesforce: if you’re going to trade losses now for growth, you need the growth. To some extent, this is what we see when we compare the two companies.

The initial decline from 70+ percent growth for both companies is likely the inevitable product of simple math: the more you make, the harder it is to grow. While we can discount the first half of this chart, the second half is intriguing in that it is a reversal of the pattern we have seen above. While VMware solidly outperformed Salesforce at a growing rate in profit and income, Salesforce, beginning about a year after its PP&E investments picked up, has grown its revenue at a higher rate than has VMware. Early in this period you could argue the rate differential was a function of revenue disparities, but the delta between the revenue numbers last quarter was less than 10%.
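The "simple math" point is easy to see with a quick calculation. The sketch below computes year-over-year growth for two hypothetical companies that add the same absolute revenue from very different bases; the figures are invented for illustration and are not taken from either company's results.

```python
def yoy_growth(current, year_ago):
    """Year-over-year growth rate as a fraction of the year-ago figure."""
    if year_ago <= 0:
        raise ValueError("year-ago revenue must be positive")
    return (current - year_ago) / year_ago


# Hypothetical quarterly revenue in millions of dollars.
# Both companies add the same $70M, but from different starting points.
small_base = yoy_growth(170.0, 100.0)
large_base = yoy_growth(1070.0, 1000.0)

print(f"{small_base:.0%}")  # 70%
print(f"{large_base:.0%}")  # 7%
```

The same dollar increase that reads as 70 percent growth early on is single digits once the base is ten times larger, which is why growth comparisons are most meaningful when the two revenue figures are close.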

In general, none of these results should be surprising. VMware has successfully capitalized on a dominant position in a valuable market and the financial results demonstrate that. Salesforce, as investors appear to have recognized, is clearly trading short term losses against the longer term return. While there is some evidence to suggest that Salesforce’s strategy is beginning to see results, and VMware is probably paying closer attention to its overall ability to grow revenue, it’s still very early days. It’s equally possible that one or both are poor representatives for their respective approach. It will be interesting to monitor these numbers over time, however, to try and test how the two models continue to perform versus one another.

Categories: Business Models, Software-as-a-Service.

The RedMonk Programming Language Rankings: June 2014

As we settle into a roughly bi-annual schedule for our programming language rankings, it is now time for the second drop of the year. This being the second run since GitHub retired its own rankings, forcing us to replicate them by querying the GitHub archive, we are continuing to monitor the rankings for material differences between current and past rankings. While we’ve had slightly more movement than is typical, by and large the results have remained fairly consistent.

One important trend worth tracking, however, is the correlation between the GitHub and Stack Overflow rankings. This is the second consecutive period in which the relationship between how popular a language is on GitHub versus Stack Overflow has weakened; this run’s .74 is in fact the lowest observed correlation to date. Historically, the number has been closer to .80. With only two datapoints indicating a weakening – and given the fact that at nearly .75, the correlation remains strong – it is premature to speculate as to cause. But it will be interesting to monitor this relationship over time; should GitHub and Stack Overflow continue to drift apart in terms of programming language traction, it would be news.
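For anyone wishing to run a similar check on their own data, the sketch below computes a correlation between two lists of ranks. It is an assumption that a rank-based measure is the right one here: the post does not specify which coefficient is used, and the rank lists below are invented. Applying the Pearson formula to ranks, as done here, yields the Spearman rank correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; applied to ranks this is Spearman's rho."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical ranks for six languages on each site (1 = most popular).
github_rank = [1, 2, 3, 4, 5, 6]
stackoverflow_rank = [2, 1, 3, 6, 4, 5]

rho = pearson(github_rank, stackoverflow_rank)
print(round(rho, 2))  # 0.77
```

A value near 1 means the two communities order the languages almost identically; the further it falls, the more the two populations disagree.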

For the time being, however, the focus will remain on the current rankings. Before we continue, please keep in mind the usual caveats.

  • To be included in this analysis, a language must be observable within both GitHub and Stack Overflow.
  • No claims are made here that these rankings are representative of general usage more broadly. They are nothing more or less than an examination of the correlation between two populations we believe to be predictive of future use, hence their value.
  • There are many potential communities that could be surveyed for this analysis. GitHub and Stack Overflow are used here first because of their size and second because of their public exposure of the data necessary for the analysis. We encourage, however, interested parties to perform their own analyses using other sources.
  • All numerical rankings should be taken with a grain of salt. We rank by numbers here strictly for the sake of interest. In general, the numerical ranking is substantially less relevant than the language’s tier or grouping. In many cases, one spot on the list is not distinguishable from the next. The separation between language tiers on the plot, however, is generally representative of substantial differences in relative popularity.
  • GitHub language rankings are based on raw lines of code, which means that repositories written in a given language that include a greater amount of code in a second language (e.g. JavaScript) will be read as the latter rather than the former.
  • In addition, the further down the rankings one goes, the less data available to rank languages by. Beyond the top tiers of languages, depending on the snapshot, the amount of data to assess is minute, and the actual placement of languages becomes less reliable the further down the list one proceeds.
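The lines-of-code caveat above can be illustrated with a small sketch. Assuming a repository is labeled with whichever language accounts for the most lines (the exact classification mechanism is an assumption, and the repository figures are invented), a project its authors think of as Ruby could be counted as JavaScript.

```python
def dominant_language(lines_by_language):
    """Return the language with the most lines of code in a repository."""
    if not lines_by_language:
        raise ValueError("repository has no recognized code")
    return max(lines_by_language, key=lines_by_language.get)


# Hypothetical Ruby web application that bundles a large JavaScript library.
repo = {"Ruby": 4200, "JavaScript": 9800, "CSS": 1500}

print(dominant_language(repo))  # JavaScript
```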

[Chart: programming language rankings plot, June 2014 (lang-rank-614-wm)]

Besides the above plot, which can be difficult to parse even at full size, we offer the following numerical rankings. As will be observed, this run produced several ties which are reflected below.

1 Java / JavaScript
3 PHP
4 Python
5 C#
6 C++ / Ruby
8 CSS
9 C
10 Objective-C
11 Shell
12 Perl
13 R
14 Scala
15 Haskell
16 Matlab
17 Visual Basic
18 CoffeeScript
19 Clojure / Groovy
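The numbering above skips a position after each tie (1, then 3; 6, then 8), which is the convention usually called standard competition ranking. A minimal sketch of that convention follows; the scores are invented for illustration and are not the values behind these rankings.

```python
def competition_rank(scores):
    """Rank names by descending score; tied scores share a rank and the
    next distinct score skips ahead (1, 1, 3, ...)."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranked = []
    previous_score = None
    previous_rank = 0
    for position, (name, score) in enumerate(ordered, start=1):
        # A tie reuses the previous rank; otherwise rank equals list position.
        rank = previous_rank if score == previous_score else position
        ranked.append((rank, name))
        previous_score, previous_rank = score, rank
    return ranked


# Hypothetical combined scores.
scores = {"Java": 98, "JavaScript": 98, "PHP": 95, "Python": 93}

for rank, name in competition_rank(scores):
    print(rank, name)
# 1 Java
# 1 JavaScript
# 3 PHP
# 4 Python
```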

Most notable for advocates of either Java or JavaScript is the tie atop these rankings. This finding is not surprising in light of the fact that one or the other – most commonly JavaScript – has been atop our rankings as long as we have had them, with the loser invariably finishing in second place. For this run, however, the two languages find themselves in a statistical tie. While the actual placement is, as mentioned above, not particularly significant from an overall share perspective, the continued, sustained popularity of these two runtimes is notable.

Aside from that tie, the rest of the Top 10 is relatively stable. Python retook fourth place from C#, and CSS pushed back C and Objective-C, but these changes notwithstanding, the elite performers in this ranking remain elite performers. PHP, as one example, remains rock steady in third behind the Java/JavaScript tandem, and aside from a slight decline from Ruby (5 in 2013, 7 today) little else has changed. Which means that the majority of the interesting activity occurred further down the spectrum. A few notes below on notable movements from selected languages.

  • R: Advocates of R will be pleased by the language’s fourth consecutive gain in the rankings. From 18 in January of 2013 to 13 in this run, the R language continues to rise. Astute observers might note by comparing plots that this is in part due to growth on GitHub; while R has always performed well on Stack Overflow due to the volume of questions and answers, it has tended to be under-represented on GitHub. This appears to be slowly changing, however, in spite of competition from Python, issues with the runtime itself and so on.
  • Go: Like R, Go is sustaining its upward trajectory in the rankings. It didn’t match its six place jump from our last run, but the language moved up another spot and sits just outside the Top 20 at 21. While we caution against reading much into the actual placement on these rankings, where differences between spots can over-represent only marginal differences in performance, we do track changes in trajectory closely. While its 21st spot, therefore, may not distinguish it materially from the languages directly above or behind it, its trendline within these rankings does. Given the movement to date, as well as the qualitative evidence we see in terms of projects moving to Go from other alternatives, it is not unreasonable to expect Go to be a Top 20 language within the next six to twelve months.
  • Perl: Perl, on the other hand, is trending in the opposite direction. Its decline has been slow, to be fair, dropping from 10 only down to 12 in our latest rankings, but it’s one of the few Tier 1 languages that has experienced a decline with no offsetting growth since we have been conducting these rankings. While Perl was the glue that pulled together the early web, many believe the Perl 5 versus Perl 6 divide has fractured that userbase, and at the very least has throttled adoption. While the causative factors are debatable, however, the evidence – both quantitative and qualitative – points to a runtime that is less competitive and significant than it once was.
  • Julia/Rust: Two of the first quarter’s three languages to watch – Elixir didn’t demonstrate the same improvement – continued their rise. Each jumped 5 spots from 62/63 to 57/58. This leaves them still well outside the second tier of languages, but they continue to climb in our rankings. For differing reasons, these two languages are proving to be popular sources of investigation and experimentation, and it’s certainly possible that one or both could follow in Go’s footsteps and make their way up the rankings into the second tier of languages at a minimum.
  • Dart: Dart, Google’s potential replacement for JavaScript, is a language we receive periodic inquiries about, although not as high a volume of them as might be expected. It experienced no movement since our last ranking, placing 39 in both of our last two runs. And while solidly in the second tier at that score, it hasn’t demonstrated to date the same potential for rapid uptake that Go has – in all likelihood because its intended target, JavaScript, has sustained its overwhelming popularity.
  • Swift: Making its debut on our rankings in the wake of its announcement at WWDC is Swift, which checks in at 68 on our board. Depending on your perspective, this is either low for a language this significant or impressive for a language that is a few weeks old. Either way, it seems clear that – whatever its technical issues and limitations – Swift is a language that is going to be a lot more popular, and very soon. It might be cheating, but Swift is our language to watch this quarter.

Big picture, the takeaway from the rankings is that the language diversity explored most recently by my colleague remains the norm. While the Top 20 continues to be relatively static, we do see longer term trends adding new players (e.g. Go) to this mix. Whatever the resulting mix, however, it will ultimately be a reflection of developers’ desires to use the best tool for the job.

Categories: Programming Languages.