
Revisiting the 2014 Predictions

With the calendar now reading January, it’s time to look ahead to 2015 and set down some predictions for the forthcoming year. As is the case every year, this is a two part exercise: first, reviewing and grading the predictions from the prior year, and second, making the predictions for this one. The results of the first hopefully allow the reader to properly weight the contents of the second – with one important caveat that I’ll get to.

This year will be the fifth year I have set down annual predictions. For the curious, here is how I have fared in years past.

Before we get to the review of 2014’s forecast, one important note regarding the caveat mentioned above. Prior to 2013, the predictions in this space focused on relatively straightforward expectations with well understood variables. After some ferocious taunting (er, constructive feedback) from Bryan Cantrill, however, the emphasis shifted towards trying to anticipate the genuinely unexpected rather than the safely predictable.

You can see from the score how that worked out. Nevertheless, we press on. Without further delay, here are the 2014 predictions in review.


38% or Less of Oracle’s Software Revenue Will Come from New Licenses

As discussed in November of 2013 and July of 2012, while Oracle has consistently demonstrated an ability to grow its software related revenues, the percentage of same derived from the sale of new licenses has been in decline for over a decade. Down to 38% in 2013 from 71% in 2000, there are no obvious indications that 2014 will buck this trend.

Because of some changes in reporting, it’s a bit tricky to answer this one simply. When Oracle reported its financial results in 2012, their consolidated statement of operations included just two categories in software revenue: “New software licenses” and “Software license updates and product support.” The simple, classic perpetual license software business, in other words. A year later, however, Oracle was still reporting in two categories, but “New software licenses” had become “New software licenses and cloud software subscriptions.”

While this made it impossible to compare 2012 to 2013 on an apples to apples basis, the basic premise held: reporting only new software licenses in 2012, the percentage of overall software revenue that category represented was 37.93%. The number in 2013, with “cloud software subscriptions” now folded in? 37.58%. With or without cloud revenue included, then, a distinct minority – and declining percentage – of the overall Oracle revenue was derived from the sale of new licenses.

For the fiscal year 2014, however, Oracle finally abandoned the two category reporting structure and broke cloud revenue out into not just one but two entirely new categories. In its 2014 10-K, Oracle provides revenue numbers for the following:

  • New software licenses
  • Cloud software-as-a-service and platform-as-a-service
  • Cloud infrastructure-as-a-service
  • Software license updates and product support

Which begs the question: if we’re trying to determine what percentage of Oracle’s software revenue derives from the sale of new software licenses, do we include one or both cloud categories, or evaluate software on a stand alone basis?

If the latter, the 2014 prediction is easily satisfied. If we exclude cloud revenue, only 32.25% of Oracle’s non-hardware revenue was extracted from new software licenses – a drop of 5.68 percentage points since the last time Oracle reported on that category alone. But given that 2013 conflated software and cloud revenue and was used as the basis for the 2014 prediction here, it seems only fair to use that as the basis for judgment.

So what percentage of overall revenue did new cloud (IaaS, PaaS and SaaS included) and software licenses generate for the company in 2014? 37.65%.
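The arithmetic behind these percentages is simple to reproduce. A minimal sketch follows, using approximate FY2014 category figures (in $ millions) reconstructed from Oracle’s 10-K; treat the raw dollar amounts as illustrative rather than authoritative, though they yield the percentages discussed above:

```python
# Approximate Oracle FY2014 software/cloud revenue by 10-K category,
# in $ millions. Figures are illustrative reconstructions, not exact.
revenue = {
    "new_software_licenses": 9416,
    "cloud_saas_paas": 1121,
    "cloud_iaas": 456,
    "updates_and_support": 18206,
}

total = sum(revenue.values())

# Software-only view: share of all software+cloud revenue that came
# strictly from new license sales.
licenses_only = revenue["new_software_licenses"] / total * 100

# Combined view: fold both cloud categories in with new licenses,
# matching 2013's "new software licenses and cloud subscriptions" bucket.
licenses_plus_cloud = (
    revenue["new_software_licenses"]
    + revenue["cloud_saas_paas"]
    + revenue["cloud_iaas"]
) / total * 100

print(f"New licenses only:    {licenses_only:.2f}%")
print(f"New licenses + cloud: {licenses_plus_cloud:.2f}%")
```

Run against these figures, the first number comes out near the 32.25% cited above and the second near 37.65%, which is how the two views of the prediction were evaluated.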

Which means we’ll count this prediction as a hit.

The Biggest Problem w/ IoT in 2014 Won’t Be Security But Compatibility

Part of the promise of IoT devices is that they can talk to each other, and operate more efficiently and intelligently by collaborating. And there are instances already where this is the case: the Nest Protect smoke alarm, for example, can shut off a furnace in case of fire through the Nest thermostat. But the salient detail in that example is the fact that both devices come from the same manufacturer. Thus far, most of the IoT devices being shipped are designed as individual silos of information. So much so, in fact, that an entirely new class of hardware – hubs – has been created to try and centrally manage and control the various devices, which have not been designed to work together. But while hubs can smooth out the rough edges of IoT adoption, they are more band-aid than solution.

And because this may benefit market leaders like Nest – customers have a choice between buying other home automation devices that can’t talk to their Nest infrastructure or waiting for Nest to produce ones that do – the market will be subject to inertial effects. Efforts like the AllSeen Alliance are a step in the right direction, but in 2014 would-be IoT customers will be substantially challenged and held back by device to device incompatibility.

If the high profile penetrations of JP Morgan, Sony et al had been IoT related, this prediction would have been more problematic. But while there were notable IoT related security incidents like those described in this December report in which the blast furnace in a German factory was remotely manipulated, in 2014 the bigger issue seems to have been compatibility.

Perhaps in recognition of this limiting factor, manufacturers have indicated that 2015 is going to see progress in this area. Early in January, for example, Nest announced at CES that it would be partnering with over a dozen new third party vendors, from August to LG to Philips to Whirlpool. 2014 also saw the company acquire Dropcam, the manufacturer of a potentially complementary device. This interoperation will be crucial to expanding the market as a whole, because connected devices that are unable to interoperate with each other are of far more limited utility.

I’ll count this as a hit.

Windows 7 Will Be Microsoft’s Most Formidable Competitor

The good news for Microsoft is that Windows 7 adoption is strong, with more than twice the share of Windows XP, the next most popular operating system according to Statcounter. The bad news for Microsoft is that Windows 7 adoption is strong.

With even Microsoft advocates characterizing Windows 8 as a “mess,” Microsoft has some difficult choices to make moving forward. Even setting aside the fact that mobile platforms are actively eroding the PC’s relevance, what can or should Microsoft tell its developers? Embrace the design guidelines of Windows 8, which the market has actively selected against? Or stick with Windows 7, which is widely adopted but not representative of the direction that Microsoft wants to head? In short, then, the biggest problem Microsoft will face in evangelizing Windows 8 is Windows 7.

The good news for Microsoft is that Windows 7 declined slightly, from 50.32% in January to 49.14% in December. The bad news is that Windows 8.1 (11.77%) is still behind Windows XP (11.93%) in share. Back on the bright side, that was up from Windows 8’s 7.57% in January and the next closest non-Microsoft competitor was Mac OS at 7.83%.

Still, it seems pretty clear that Windows 7 is Microsoft’s most formidable competitor – we’ll see how Windows 10 does against it. Hit.

The Low End Premium Server Business is Toast

Simply consider what’s happened over the last 12 months. IBM spun off its x86 server business to Lenovo, at a substantial discount from the original asking price if reports are correct. Dell was forced to go private. And HP, according to reports, is about to begin charging customers for firmware updates. Whether the wasteland that is the commodity server business is more the result of defections to the public cloud or big growth from ODMs is ultimately irrelevant: the fact is that the general purpose low end server market is doomed. This prediction would seem to logically dictate decommitments to low end server lines from other businesses besides IBM, but the bet here is that emotions win out and neither Dell nor HP is willing to cut that particular cord – and Lenovo is obviously committed.

It’s difficult to measure this precisely because players like Dell remain private and shipment volumes from ODM suppliers are opaque, but there are several things we know. In spite of growth in PCs, HP’s revenue was down 4% (1% net) in 2014. And while CEO Meg Whitman expects x86 servers to play a part in a 2015 rebound, there are no signs that that was the case in 2014. Cisco, meanwhile, which eclipsed HP for sales of x86 blade servers in Q1, grew its datacenter business (which includes servers) in 2014 at a 27.3% clip compared to the year prior, but that was down from 59.8% growth 2012-2013 and the 2014 revenue represents only 7.3% of Cisco’s total for the year.

Amazon, on the other hand, is growing by virtually any metric, and rapidly in terms of users, consumption metrics and its portfolio of available services. Nor is Amazon the only growth area in public cloud: DigitalOcean has become the fourth largest web host in the world in less than two years, according to Netcraft.

Whether you base it on Amazon’s one million plus customers, then, or the uncertain fortunes of the x86 alternatives, it’s clear that traditional x86 businesses remain in real trouble. Hit.


2014 Will See One or More OpenStack Entities Acquired

Belatedly recognizing that the cloud represents a clear and present danger to their businesses, incumbent systems providers will increasingly double down on OpenStack as their response. Most already have some commitment to the platform, but increasing pressure from public cloud providers (primarily Amazon) as well as proprietary alternatives (primarily VMware) will force more substantial responses, the most logical manifestation of which is M&A activity. Vendors with specialized OpenStack expertise will be in demand as providers attempt to “out-cloud” one another on the basis of claimed expertise.

There are a few acquisitions here that are not OpenStack entities but certainly influenced by same – HP/Eucalyptus and Red Hat/Inktank come to mind – but it’s not necessary to include these to make this prediction come true. Just in the last year we’ve seen EMC acquire Cloudscaling, Cisco pick up Metacloud and Red Hat bring on eNovance. That leaves a variety of players still on the board, from Blue Box to Mirantis to Piston, and it will be interesting to see whether further consolidation lies ahead. But in the meantime, this prediction can safely be scored as a hit.

The Line Between Venture Capitalist and Consultant Will Continue to Blur

We’ve already seen this to some extent, with Hilary Mason’s departure to Accel and Adrian Cockcroft’s move to Battery Ventures. This will continue in large part because it can represent a win for both parties. VC shops, increasingly in search of a means of differentiation, will seek to provide it with high visibility talent on staff and available in a quasi-consultative capacity. And for the talent, it’s an opportunity to play the field to a certain extent, applying their abilities to a wider range of businesses rather than strictly focusing on one. Like EIR roles, they may not be long term, permanent positions: the most likely outcome, in fact, is for talent to eventually find a home at a portfolio company, much as Marten Mickos once did at Eucalyptus from Benchmark. But in the short term, these marriages are potentially a boon to both parties and we’ll see VCs emerge as a first tier destination for high quality talent.

The year 2014 did see some defections to the VC ranks, but certainly nothing that could be construed as a legitimate trend for the year. This is a miss.

Netflix’s Cloud Assets Will Be Packaged and Create an Ecosystem Like Hadoop Before Them

My colleague has been arguing for the packaging of Netflix’s cloud assets since November of 2012, and to some extent this is already occurring – we spoke to a French ISV in the wake of Amazon re:Invent that is doing just this. But the packaging effort will accelerate in 2014, as would-be cloud consumers increasingly realize that there is more to operating in the cloud than basic compute/network/storage functionality. From Asgard to Chaos Monkey, vendors are increasingly going to package, resell and support the Netflix stack much as communities have sprung up around Cassandra, Hadoop and other projects developed by companies not in the business of selling software. To give myself a small out here, however, I don’t expect much from the ecosystem space in 2014 – that will only come over time.

In spite of some pilot efforts here and there including services work, there was little “acceleration” of the packaging of Netflix’s cloud assets. This is a miss.


Disruption Finally Comes to Storage and Networking in 2014

While it’s infrequently discussed, networking and storage have proven to be largely immune from the aggressive commoditization that has consumed first major software businesses and then low end server hardware. They have not been totally immune, of course, but by and large both networking and storage have been relatively insulated against the corrosive impact of open source software – in spite of the best efforts of some upstart competitors.

This will begin to change in 2014. In November, for example, Facebook’s VP of hardware design disclosed that they were very close to developing open source top-of-rack switches. That open source would eventually come for both the largely proprietary networking and storage providers was always inevitable; the question was timing. We are finally beginning to see signs that one or both will be disrupted in the current year, whether it’s through collective efforts like the Open Compute Project or simply clever repackaging of existing technologies – an outcome that seems more likely in storage than networking.

As discussed previously, strictly speaking, disruption had already come for storage at the time that this was originally written. As for networking, the disclosure that some of the largest potential networking customers – Amazon and Facebook, among others – are now designing and manufacturing their own networking gear instead of purchasing it from traditional suppliers was disruptive enough. The fact that Facebook’s custom network designs, at least, are likely to be released to the public should be that much more concerning to traditional networking suppliers.

With the caveat then that the storage timing, at least, was off, this is a hit.


The Most Exciting Infrastructure Technology of 2014 Will Not Be Produced by a Company That Sells Technology

More and more today the most interesting new technologies are being developed not by companies that make money from software – one reason that traditional definitions of “technology company” are unhelpful – but from those that make money with software. Think Facebook, Google, Netflix or Twitter. It’s not that technology vendors are incapable of innovating: there are any number of materially interesting products that have been developed for purposes of sale.

The difficulty with predictions like these, as I should know by now, is that they’re dependent on arbitrary and subjective definitions – in this case, what’s the most “exciting” project of 2014. While there are many potential candidates, for us at RedMonk, Docker was one of our most discussed infrastructure projects over the past calendar year. By a variety of metrics, it’s one of the most quickly growing projects we have ever seen. The Google Trends graph above corroborates this, albeit in an understated manner.

As a result, it seems fair to argue that Docker is a good candidate for the most exciting infrastructure technology of 2014. And unfortunately for my prediction, it is in fact produced by a company that sells software. So this is a miss.

Google Will Move Towards Being a Hardware Company (originally: Google Will Buy Nest)

In the wake of Google’s acquisition of Nest, which I cannot claim with a straight face that I would have predicted, this prediction probably would have been better positioned in the Safe or Likely categories, as it seemed to indicate a clear validation of this assertion. But then they went and sold Motorola to Lenovo, effectively de-committing from the handset business.

So while I don’t expect hardware to show up in the balance sheet in a meaningful way in 2014, it seems probable that by the end of the year we’ll be more inclined to think of Google as a hardware company than we do today.

In spite of the launch of the Nexus Player, the acquisition of Nest, the continued success of the Chromecast, beating Apple to market with Android-powered smartwatches and a new pair of Nexus phone and tablet devices – not to mention the self-driving cars – it can’t realistically be claimed that people think of Google as a hardware company today. Certainly the company has more involvement in physical hardware than it ever has, but by and large the company’s perception is shaped by its services: Search, AdSense/AdWords, Gmail, GCE etc. That might have shifted somewhat if the Nest brand had been folded into Google’s and the company had released additional device types, but that’s merely speculation.

The fact is that Google is not materially more of a hardware company today than it was when these predictions were made. Ergo, this is a miss.


Google Will Acquire IFTTT

Acquisitions are always difficult to predict, because of the number of variables involved. But let’s say, for the sake of argument, that you a) buy the prediction that a major problem with the IoT is compatibility and b) believe that Google is becoming more of a hardware company broadly and an IoT company over time: what’s the logical next step if you’re Google? Maybe you contemplate the acquisition of a Belkin or similar, but more likely you (correctly) decide the company has quite enough to digest at the moment in the way of hardware acquisitions. But what about IFTTT?

By more closely marrying the service to their collaboration tools, Google could a) differentiate same, b) begin acclimating consumers to IoT-style interconnectivity, and c) begin generating even more data about consumer habits to feed their existing (and primary) revenue stream, advertising.

Not much argument here, as IFTTT was not acquired by anyone, Google included. The logic behind the prediction remains sound, but there’s no way to count this as anything other than a miss.

The Final Tally

To wrap things up, how did the above predictions score? The short answer is not well. Out of the ten predictions for the year, five were correct. Which means, unfortunately, that five were not, good for a dismal 50% average. In the now five years of this exercise, 50% is the lowest score yet, coming in just below last year’s – the first under the new, more aggressive format – which is obviously not a coincidence.

In my defense, however, the misses were primarily drawn from the least certain predictions; all of the “Safe” predictions, for example, were hits. In terms of scoring, then, the context is important. The failure rate of predictions is highly correlated to their difficulty. It’s simpler, obviously, to predict acquisitions in a given category than to predict a specific acquirer/acquiree match.

All of that said, the forthcoming predictions for 2015 will remain aggressive in nature, even if that means 2016 will see a similarly contrite and humble predictions wrap up.

Categories: Cloud, Hardware, IoT, Network, Storage.

The RedMonk Programming Language Rankings: January 2015

Update: These rankings have been updated. The third quarter snapshot is available here.

With two quarters having passed since our last snapshot, it’s time to update our programming language rankings. Since Drew Conway and John Myles White originally performed this analysis late in 2010, we have been regularly comparing the relative performance of programming languages on GitHub and Stack Overflow. The idea is not to offer a statistically valid representation of current usage, but rather to correlate language discussion (Stack Overflow) and usage (GitHub) in an effort to extract insights into potential future adoption trends.

In general, the process has changed little over the years. With the exception of GitHub’s decision to no longer provide language rankings on its Explore page – they are now calculated from the GitHub archive – the rankings are performed in the same manner, meaning that we can compare rankings from run to run, and year to year, with confidence.

This is brought up because one result in particular, described below, is very unusual. But in the meantime, it’s worth noting that the steady decline in correlation between rankings on GitHub and Stack Overflow observed over the last several iterations of this exercise has been arrested, at least for one quarter. After dropping from its historical .78 – .8 correlation to .74 during the Q3 2014 rankings, the correlation between the two properties is back up to .76. It will be interesting to observe whether this is a temporary reprieve, or if the lack of correlation itself was the anomaly.
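For the curious, the correlation in question can be approximated by pairing each language’s rank on the two properties and computing a Pearson correlation over the paired ranks (i.e. a Spearman rank correlation, absent ties). A minimal sketch, using a hypothetical five-language sample rather than the actual ranking data:

```python
def pearson(xs, ys):
    """Plain Pearson correlation coefficient over two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical ranks for the same five languages on each property.
github_ranks        = [1, 2, 3, 4, 5]
stackoverflow_ranks = [2, 1, 3, 5, 4]

# Pearson over ranks equals Spearman's rho when there are no ties.
rho = pearson(github_ranks, stackoverflow_ranks)
print(round(rho, 2))  # → 0.8
```

A rho in the .74 to .8 range, as observed across these rankings, indicates the two populations order languages similarly but far from identically, which is why divergences between them are worth watching.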

For the time being, however, the focus will remain on the current rankings. Before we continue, please keep in mind the usual caveats.

  • To be included in this analysis, a language must be observable within both GitHub and Stack Overflow.
  • No claims are made here that these rankings are representative of general usage more broadly. They are nothing more or less than an examination of the correlation between two populations we believe to be predictive of future use, hence their value.
  • There are many potential communities that could be surveyed for this analysis. GitHub and Stack Overflow are used here first because of their size and second because of their public exposure of the data necessary for the analysis. We encourage, however, interested parties to perform their own analyses using other sources.
  • All numerical rankings should be taken with a grain of salt. We rank by numbers here strictly for the sake of interest. In general, the numerical ranking is substantially less relevant than the language’s tier or grouping. In many cases, one spot on the list is not distinguishable from the next. The separation between language tiers on the plot, however, is generally representative of substantial differences in relative popularity.
  • GitHub language rankings are based on raw lines of code, which means that repositories written in a given language that include a greater amount of code in a second language (e.g. JavaScript) will be read as the latter rather than the former.
  • In addition, the further down the rankings one goes, the less data available to rank languages by. Beyond the top tiers of languages, depending on the snapshot, the amount of data to assess is minute, and the actual placement of languages becomes less reliable the further down the list one proceeds.

(click to embiggen the chart)

Besides the above plot, which can be difficult to parse even at full size, we offer the following numerical rankings. As will be observed, this run produced several ties which are reflected below (they are listed out here alphabetically rather than consolidated as ties because the latter approach led to misunderstandings).

1 JavaScript
2 Java
3 PHP
4 Python
5 C#
5 C++
5 Ruby
8 CSS
9 C
10 Objective-C
11 Perl
11 Shell
13 R
14 Scala
15 Haskell
16 Matlab
17 Go
17 Visual Basic
19 Clojure
19 Groovy
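The tied positions above follow standard competition ranking: tied languages share a rank, and the next distinct position skips ahead by the size of the tie, which is why a three-way tie at fifth is not followed by a sixth place. A sketch of that scheme, using hypothetical scores rather than the actual ranking inputs:

```python
def competition_rank(scores):
    """Assign '1224'-style competition ranks to (name, score) pairs.

    Tied scores share a rank; the next distinct score resumes at
    1 + the number of items ranked ahead of it.
    """
    ordered = sorted(scores, key=lambda pair: pair[1], reverse=True)
    ranks = []
    for i, (name, score) in enumerate(ordered):
        if i > 0 and score == ordered[i - 1][1]:
            rank = ranks[-1][0]   # share the previous item's rank
        else:
            rank = i + 1          # skip past all tied items above
        ranks.append((rank, name))
    return ranks

# Hypothetical scores, for illustration only.
sample = [("C#", 90), ("C++", 90), ("Ruby", 90),
          ("Python", 95), ("CSS", 80)]
for rank, name in competition_rank(sample):
    print(rank, name)
```

With these sample scores, the output runs 1, 2, 2, 2, 5 – the same shape seen in the list above.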

By the narrowest of margins, JavaScript edged Java for the top spot in the rankings, but as always, the difference between the two is so marginal as to be insignificant. The most important takeaway is that the language frequently written off for dead and the language sometimes touted as the future have shown sustained growth and traction and remain, according to this measure, the most popular offerings.

Outside of that change, the Top 10 was effectively static. C++ and Ruby each jumped one spot to split fifth place with C#, but that minimal distinction reflects the lack of movement in the rest of the “Tier 1,” or top grouping of languages. PHP has not shown the ability to unseat either Java or JavaScript, but it has remained unassailable for its part in the third position. After a brief drop in Q1 of 2014, Python has been stable in the fourth spot, and the rest of the Top 10 looks much as it has for several quarters.

Further down in the rankings, however, there are several trends worth noting – one in particular.

  • R: Advocates of the language have been pleased by four consecutive gains in these rankings, but this quarter’s snapshot showed R instead holding steady at 13. This was predictable, however, given that the languages remaining ahead of it – from Java and JavaScript at the top of the rankings to Shell and Perl just ahead – are more general purpose and thus likely to be more widely used. Even if R’s growth does stall at 13, however, it will remain the most popular statistical language by this measure, and this in spite of substantial competition from general purpose alternatives like Python.

  • Go: In our last rankings, it was predicted based on its trajectory that Go would become a Top 20 language within six to twelve months. Six months following that, Go can consider that mission accomplished. In this iteration of the rankings, Go leapfrogs Visual Basic, Clojure and Groovy – and displaces Coffeescript entirely – to take number 17 on the list. Again, we caution against placing too much weight on the actual numerical position, because the differences between one spot and another can be slight, but there’s no arguing with the trendline behind Go. While the language has its critics, its growth prospects appear secure. And should the Android support in 1.4 mature, Go’s path to becoming a Top 10 if not Top 5 language would be clear.

  • Julia/Rust: Two of the notable languages to watch for some time now, Julia and Rust have typically grown in lockstep, though not for any particular functional reason. This time around, however, Rust outpaced Julia, jumping eight spots to 50 against Julia’s steadier progression from 57 to 56. It’s not clear what’s responsible for the differential growth – whether it’s problems with Julia, progress from Rust (with a DTrace probe, even), or both. But while both remain languages of interest, this ranking suggests that Rust might be poised to outpace its counterpart.

  • Coffeescript: As mentioned above, Coffeescript dropped out of the Top 20 languages for the first time in almost two years, and may have peaked. From its high ranking of 17 in Q3 of 2013, in the three runs since, it has clocked in at 18, 18 and now 21. The “little language that compiles into JavaScript” positioned itself as a compromise between JavaScript’s ubiquity and syntactical eccentricities, but support for it appears to be slowly eroding. How it performs in the third quarter rankings should provide more insight into whether this is a temporary dip or more permanent decline.

  • Swift: Last, there is the curious case of Swift. During our last rankings, Swift was listed as the language to watch – an obvious choice given its status as the Apple-anointed successor to the #10 language on our list, Objective-C. Being officially sanctioned as the future standard for iOS applications everywhere was obviously going to lead to growth. As was said during the Q3 rankings which marked its debut, “Swift is a language that is going to be a lot more popular, and very soon.” Even so, the growth that Swift experienced is essentially unprecedented in the history of these rankings. When we see dramatic growth from a language it typically has jumped somewhere between 5 and 10 spots, and the closer the language gets to the Top 20 or within it, the more difficult growth is to come by. And yet Swift has gone from our 68th ranked language during Q3 to number 22 this quarter, a jump of 46 spots. From its position far down on the board, Swift now finds itself one spot behind Coffeescript and just ahead of Lua. As the plot suggests, Swift’s growth is more obvious on StackOverflow than GitHub, where the most active Swift repositories are either educational or infrastructure in nature, but even so the growth has been remarkable. Given this dramatic ascension, it seems reasonable to expect that the Q3 rankings this year will see Swift as a Top 20 language.

The Net

Swift’s meteoric growth notwithstanding, the high level takeaway from these rankings is stability. The inertia of the Top 10 remains substantial, and what change there is in the back half of the Top 20 or just outside of it – from Go to Swift – is both predictable and expected. The picture these rankings paint is of an environment thoroughly driven by developers; rather than seeing a heavy concentration around one or two languages as has been an aspiration in the past, we’re seeing a heavy distribution amongst a larger number of top tier languages followed by a long tail of more specialized usage. With the exceptions mentioned above, then, there is little reason to expect dramatic change moving forward.

Update: The above language plot chart was based on an incorrect Stack Overflow tag for Common Lisp and thereby failed to incorporate existing activity on that site. This has been corrected.

Categories: Programming Languages.

DVCS and Git Usage in 2014

To many in the technology industry, the dominance of Distributed Version Control Systems (DVCS) generally and Git specifically is taken as a given. Whether it’s consumed as a product (e.g. GitHub Enterprise/Stash), a service (Bitbucket, GitHub) or the base project, Git is the de facto winner in the DVCS category, a category which has taken considerable share from its centralized alternatives over the past few years. With macro trends fueling further adoption, it’s natural to expect that the ascent of Git would continue unimpeded.

One datapoint which has proven useful for assessing the relative performance of version control systems is Open Hub (formerly Ohloh)’s repository data. Built to index public repositories, it gives us insight into the respective usage at least within its broad dataset. In 2010 when we first examined its data, Open Hub was crawling some 238,000 projects, and Git managed just 11% of them. For this year’s snapshot, that number has swelled to over 674,000 – or close to 3X as many. And Git’s playing a much more significant role today than it did then.

Before we get into the findings, more details on the source and issues.


The data in this chart was taken from snapshots of the Open Hub data exposed here.

Objections & Responses

  • “Open Hub data cannot be considered representative of the wider distribution of version control systems”: This is true, and no claims are made here otherwise. While it necessarily omits enterprise adoption, however, it is believed here that Open Hub’s dataset is more likely to be predictive moving forward than a wider sample.
  • “Many of the projects Open Hub surveys are dormant”: This is probably true. But even granting a sizable number of dormant projects, it’s expected that these will be offset by a sizable influx of new projects.
  • “Open Hub’s sampling has evolved over the years, and now includes repositories and forges it did not previously”: Also true. It also, by definition, includes new projects over time. When we first examined the data, Open Hub surveyed fewer than 300,000 projects. Today it’s over 600,000. This is a natural evolution of the survey population, one that’s inclusive of evolving developer behaviors.

With those caveats in mind, let’s start with the big picture. The following chart depicts the total share of repositories attributable to centralized (CVS/Subversion) and distributed (Bazaar/Git/Mercurial) systems.

Even over a brief three year period (we lack data for 2011, and have thus omitted 2010 for continuity’s sake) it’s clear that DVCS systems have made substantial inroads. DVCS may not be quite as dominant as is commonly assumed, but it’s close to managing one in two projects in the world. When considering the inertial effects operating against DVCS, this traction is impressive. In spite of the fact that it can be difficult even for excellent developers to shift their mental model from centralized to decentralized, that version control systems are rarely as high a priority as other infrastructure elements, and that the risks associated with moving from one system to another are non-trivial, DVCS has clearly established itself as a popular, mainstream option. Close observation of the above chart, however, reveals a slight hiccup in adoption numbers which we’ll explore in more detail shortly.

In the meantime, let’s isolate the specific changes per project between our 2014 snapshot and the 2010 equivalent. How has their relative share changed?

As might be predicted, comparing 2010 to 2014, Git is the clear winner. The project with the idiosyncratic syntax made substantial gains (25.92%), partially at the expense of Subversion (-12.02%) but more so of CVS (-16.64%). Just as clearly, Git is the flag bearer for DVCS more broadly, as the other decentralized version control systems, Bazaar and Mercurial, showed only modest improvement over that span – 1.33% and 1.41% respectively. The takeaways from this span, then, are first that DVCS is a legitimate first class citizen and second that Git is the most popular option in that category.
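
The comparison above is simple percentage-point arithmetic between two snapshots. A minimal sketch of that calculation, using illustrative shares consistent with the deltas cited in the text (not exact Open Hub figures):

```python
# Percentage-point change in repository share between two Open Hub
# snapshots. The share values below are illustrative numbers chosen to
# reproduce the deltas cited in the text, not Open Hub's actual data.
share_2010 = {"Git": 11.00, "Subversion": 60.00, "CVS": 25.00,
              "Bazaar": 2.00, "Mercurial": 2.00}
share_2014 = {"Git": 36.92, "Subversion": 47.98, "CVS": 8.36,
              "Bazaar": 3.33, "Mercurial": 3.41}

def share_delta(old, new):
    """Percentage-point change per VCS, sorted biggest gainer first."""
    deltas = {vcs: round(new[vcs] - old[vcs], 2) for vcs in old}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

for vcs, delta in share_delta(share_2010, share_2014):
    print(f"{vcs:>10}: {delta:+.2f} points")  # Git leads at +25.92
```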

What about the past year, however? Has Git continued on its growth trajectory?

The short answer is no. With this chart, it’s very important to note the scale of the Y axis: the changes reflected here are comparatively minimal, which is to be expected over the brief span of one year. That being said, it’s interesting to observe that Subversion shows a minor bounce (1.28%), while Git (-1.17%) took a correspondingly minor step back. Bazaar and CVS were down negligible amounts over the same span, while Mercurial was ever so slightly up.

Neither quantitative nor qualitative evidence supports the idea that Git adoption is stalled, nor that Subversion is poised for a major comeback. Wider market product trends, if anything, contradict the above, and suggest that the most likely explanation for the delta in Open Hub’s numbers is the addition of major new centrally managed codebases to Open Hub’s index.

It does serve as a reminder, however, that as much as the industry takes it for granted that Git is the de facto standard for version control systems, a sizable volume of projects have yet to migrate to a decentralized system of any kind. The implications for this are many. For service providers who are Git-centric, it may be worth considering creating bridges for users on other systems or even offering assistance in VCS migrations. For DVCS providers, the above may be superficially discouraging, but in reality indicates that the market opportunity is even wider than commonly assumed. And for users, it means that those still on centralized systems should consider migrating to decentralized alternatives, but by no means are condemned to the laggard category.

While it is thus assumed here, however, that the step back for Git is an artifact, it will be interesting to watch the growth of the platform over the next year. One year’s lack of growth is easily dismissed as an anomaly; a second year would be more indicative of a pattern. It will be interesting to see what the 2015 snapshot tells us.

Disclosure: Black Duck, the parent company of Open Hub, has been a RedMonk customer but is not currently.

Categories: Version Control.

The Scale Imperative

The Computing Scale Co

Once upon a time, the larger the workload, the larger the machine you would use to service it. Companies from IBM to Sun supplied enormous hardware packages to customers with similarly outsized workloads. IBM, in fact, still generates substantial revenue from its mainframe hardware business. One of the under-appreciated aspects of Sun’s demise, on the other hand, was that it had nothing to do with a failure of its open source strategy; the company’s fate was sealed instead by the collapse in sales of its E10K line, due in part to the financial crisis. For vendors and customers alike, mainframe-class hardware was the epitome of computational power.

With the rise of the internet, however, this model proved less than scalable. Companies founded in the late 1990’s like Google, whose mission was to index the entire internet, looked at the numbers and correctly concluded that the economics of that mission on a scale-up model were untenable. With scale-up an effective dead end, the remaining option was to scale-out. Instead of big machines, scale-out players would build software that turned lots of small machines into bigger machines, e pluribus unum writ in hardware. By harnessing the collective power of large numbers of low cost, comparatively low power commodity boxes the scale-out pioneers could scale to workloads of previously unimagined size.

This model was so successful, in fact, that over time it came to displace scale-up as the default. Today, the overwhelming majority of companies scaling their compute requirements are following in Amazon, Facebook and Google’s footprints and choosing to scale-out. Whether they’re assembling their own low cost commodity infrastructure or out-sourcing that task to public cloud suppliers, infrastructure today is distributed by default.

For all of the benefits of this approach, however, the power afforded by scale-out did not come without a cost. The power of distributed systems mandates fundamental changes in the way that infrastructure is designed, built and leveraged.

Sharing the Collective Burden of Software

The most basic illustration of the cost of scale-out is the software designed to run on it. As Joe Gregorio articulated seven years ago:

The problem with current data storage systems, with rare exception, is that they are all “one box native” applications, i.e. from a world where N = 1. From Berkeley DB to MySQL, they were all designed initially to sit on one box. Even after several years of dealing with MegaData you still see painful stories like what the YouTube guys went through as they scaled up. All of this stems from an N = 1 mentality.

Anything designed prior to the distributed system default, then, had to be retrofit – if possible – to not just run across multiple machines instead of a single node, but to run well and take advantage of their collective resources. In many cases, it proved simpler to start from scratch. The Google Filesystem and HDFS papers that resulted in Hadoop are one example of this; at its core, the first iterations of the project were designed to deconstruct a given task into multiple component tasks to be more easily executed by an array of machines.

From the macro-perspective, besides the inherent computer science challenges of (re)writing software for distributed, scale-out systems – which is exceptionally difficult – the economics were problematic. With so many businesses moving to this model in a relatively short span of time, a great deal of software needed to get written quickly.

Because no single player could bear the entire financial burden, it became necessary to amortize the costs across an industry. Most of the infrastructure we take for granted today, then, was developed as open source. Linux became an increasingly popular operating system choice as both host and guest; the project, according to Ohloh, is the product of over 5500 person-years in development. To put that number into context, if you could somehow find and hire 1,000 high quality kernel engineers, and they worked 40 hours a week with two weeks of vacation, it would take you 24 years to match that effort. Even Hadoop, a project that hasn’t had its 10 year anniversary yet, has seen 430 person-years committed. The even younger OpenStack, a very precocious four years old, has seen an industry conglomerate collectively contribute 594 years of effort to get the project to where it is today.
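
The arithmetic behind that comparison is easy to reproduce. The sketch below assumes – and this is an assumption, since the source doesn’t define the unit – that a “person-year” is read as a full calendar year of continuous effort (8,760 hours), which is the reading that yields the 24-year figure above:

```python
# Back-of-envelope check on the Linux/Hadoop/OpenStack effort figures.
# ASSUMPTION: a "person-year" here is a full calendar year of effort
# (24 hours x 365 days = 8,760 hours), which reproduces the article's
# "24 years for 1,000 engineers" claim.
HOURS_PER_PERSON_YEAR = 24 * 365  # 8,760 hours

def years_to_match(person_years, engineers=1000,
                   hours_per_week=40, weeks_per_year=50):
    """Calendar years a team would need to match a given effort estimate."""
    total_hours = person_years * HOURS_PER_PERSON_YEAR
    team_hours_per_year = engineers * hours_per_week * weeks_per_year
    return total_hours / team_hours_per_year

print(round(years_to_match(5500), 1))  # Linux: ~24 years
print(round(years_to_match(430), 1))   # Hadoop
print(round(years_to_match(594), 1))   # OpenStack
```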

Any one of these projects could be created single-handedly by a given entity; indeed, this is common. Just in the database space, whether it’s Amazon with DynamoDB, Facebook with Cassandra or Google with BigQuery, each scale-out player has the ability to generate its own software. But this is only possible because they are able to build upon the available and growing foundation of open source projects, where the collective burden of software is shared. Without these pooled investments and resources, each player would have to either build or purchase at a premium everything from the bare metal up.

Scale-out, in other words, requires open source to survive.

Relentless Economies of Scale

In stark contrast to the difficulty of writing software for distributed systems, microeconomic principles love them. The economies of scale that larger players can bring to bear on the markets they target are, quite frankly, daunting. Their variable costs decrease due to their ability to purchase in larger quantities; their fixed costs are amortized over a higher volume customer base; their relative efficiency can increase as scale drives automation and improved processes; their ability to attract and retain talent increases in proportion to the difficulty of the technical challenges imposed; and so on.

It’s difficult to quantify these advantages in precise terms, but we can at least attempt to measure the scale at which various parties are investing. Specifically, we can examine their reported plant, property and equipment investments.

If one accepts the hypothesis that economies of scale will play a significant role in determining who is competitive and who is not, this chart suggests that the number of competitive players in the cloud market will not be large. Consider that Facebook, for all of its heft and resources, is a distant fourth in terms of its infrastructure investments. This remains true, importantly, even if their spend was adjusted upwards to offset the reported savings from their Open Compute program.

Much as in the consumer electronics world, then, where Apple and Samsung are able to leverage substantial economies of scale in their mobile device production – an enormous factor in Apple’s ability to extract outsized and unmatched margins – so too is the market for scale-out likely to be dominated by the players that can realize the benefits of their scale most efficiently.

The Return of Vertical Integration

Pre-internet, the economics of designing your own hardware were less than compelling. In the absence of a global network, not to mention less connected populations, even the largest companies were content to outsource the majority of their technology business, and particularly hardware, to specialized suppliers. Scale, however, challenged those economics on a fundamental level, forcing those at the bleeding edge to rethink traditional infrastructure design and question all prior assumptions.

It’s long been known, for example, that Google eschewed purchasing hardware from traditional suppliers like Dell, HP or IBM in favor of its own designs manufactured by original device manufacturers (ODMs); Stephen Shankland had an in depth look at one of their internal designs in 2009. Even then, the implications of scale are apparent; it seems odd, for example, to embed batteries in the server design, but at scale, the design is “much cheaper than huge centralized UPS,” according to Ben Jai. But servers were only the beginning.

As it turns out, networking at scale is an even greater challenge than compute. On November 14th, Facebook provided details on its next generation data center network. According to the company:

The amount of traffic from Facebook to Internet – we call it “machine to user” traffic – is large and ever increasing, as more people connect and as we create new products and services. However, this type of traffic is only the tip of the iceberg. What happens inside the Facebook data centers – “machine to machine” traffic – is several orders of magnitude larger than what goes out to the Internet…

We are constantly optimizing internal application efficiency, but nonetheless the rate of our machine-to-machine traffic growth remains exponential, and the volume has been doubling at an interval of less than a year.

As of October 2013, Facebook was reporting 1.19B monthly active users. Since that time, then, machine to machine east/west networking traffic has more than doubled. Which makes it easy to understand how the company might feel compelled to reconsider traditional networking approaches, even if it means starting effectively from scratch.
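
The compounding implied by a sub-year doubling interval is worth making explicit. A minimal sketch – the doubling interval is Facebook’s stated figure, the elapsed times are hypothetical:

```python
# How traffic compounds given a doubling interval. Facebook cites
# doubling "at an interval of less than a year"; the elapsed times
# below are illustrative, not reported figures.
def growth_multiple(elapsed_years, doubling_years):
    """Multiple by which a quantity grows if it doubles every doubling_years."""
    return 2 ** (elapsed_years / doubling_years)

print(growth_multiple(1.0, 1.0))             # 2x after one year
print(growth_multiple(2.0, 1.0))             # 4x after two years
print(round(growth_multiple(1.0, 0.75), 2))  # faster still if doubling every 9 months
```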

Earlier that week at its re:Invent conference, meanwhile, Amazon went even further, offering an unprecedented peek behind the curtain. According to James Hamilton, Amazon’s Chief Architect, there are very few remaining aspects to AWS which are not designed internally. The company has obviously dramatically grown the software capabilities of its platform over time: on top of basic storage and compute, Amazon has integrated an enormous variety of previously distinct services: relational databases, a Map Reduce engine, data warehousing and analytical capabilities, DNS and routing, CDN, a key value store, a streaming platform – and most recently ALM tooling, a container service and a real-time service platform.

But the tendency of software platforms to absorb popular features is not atypical. What is much less common is the depth to which Amazon has embraced hardware design.

  • Amazon now builds their own networking gear running their own protocol. The company claims their gear is lower cost, faster and that the cycle time for bugs is reduced from months to weekly.
  • Amazon’s server and storage designs are custom to the vendor; the storage servers, for example, are optimized for density and pack in 864 disks at a weight of almost 2400 pounds.
  • Intel is now working directly with Amazon to produce custom chip designs, capable of bursting to much higher clock speeds temporarily.
  • To ensure adequate power for its datacenters, Amazon has progressed beyond simple negotiated agreements with power suppliers to building out custom substations, driven by custom switchgear the company itself designed.

Compute, networking, storage, power: where does this internal innovation path end? In Hamilton’s words, there is no category of hardware that is off-limits for the company. But the relentless in-sourcing is not driven by religious objections – such considerations are strictly functions of cost.

In economic terms, of course, this is an approximation of backward vertical integration. Amazon may not own the manufacturers themselves as in traditional vertical integration, but manufacturing is an afterthought next to the original design. By creating their own infrastructure from scratch, they avoid paying an innovation tax to third party manufacturers, can build strictly to their specifications and need only account for their own needs – not the requirements of every other potential vendor customer. The result is hardware that is, in theory at least, more performant, better suited to AWS requirements and lower cost.

While Amazon or Facebook have provided us with the most specifics, then, it’s safe to assume that vertical integration is a pattern that is already widespread amongst larger players and will only become more so.

The Net

For those without hardware or platform ambitions, the current technical direction is promising. With economies of scale growing ever larger and gradual reduction of third party suppliers continuing, cloud platform providers would appear to have margin yet to trim. And at least to date, competition on cloud platforms (IaaS, at least) has been sufficient to keep vendors from pocketing the difference, with industry pricing still on a downward trajectory. Cloud’s pricing advantage historically was the ability to pay less upfront and more over the longer term, but with base prices down close to 100% over a two year period, the longer term premium attached to cloud may gradually decline to the point of irrelevance.

On the software front, an enormous portfolio of high quality, highly valuable software that would have been financially out of the reach of small and even mid-sized firms even a few years ago is available today at no cost. Virtually any category of infrastructure software today – from the virtualization layer to the OS to the runtime to the database to the cloud middleware equivalents – has high quality, open source options available. And for those willing to pay a premium to outsource the operational responsibilities of building, deploying and maintaining this open source infrastructure, any number of third party platform providers would be more than happy to take those dollars.

For startups and other non-platform players, then, the combination of hardware costs amortized by scale and software costs distributed across a multitude of third parties means that effort can be directed towards business problems rather than basic, operational infrastructure.

The cloud platform players, meanwhile, symbiotically benefit from these transactions, in that each startup, government or business that chooses their platform means both additional revenue and a gain in scale that directly, if incrementally, drives down their costs (economies of scale) and indirectly increases their incentive and ability to reduce their own costs via vertical integration. The virtuous cycle of more customers leading to more scale leading to lower costs leading to lower prices leading to more customers is difficult to disrupt. This is in part why companies like Amazon or Salesforce are more than willing to trade profits for growth; scale may not be a zero sum game, but growth today will be easier to purchase than growth tomorrow – yet another reason to fear Amazon.

The most troubling implications of scale, meanwhile, are for traditional hardware suppliers (compute/storage/networking) and would-be cloud platform service providers. The former, obviously, are substantially challenged by the ongoing insourcing of hardware design. Compute may have been first, with Dell being forced to go private, HP struggling with its x86 business and IBM being forced to exit the commodity server business entirely. But it certainly won’t be the last. Networking and storage players alike are or should be preparing for the same disruption server manufacturers have experienced. The problem is not that cloud providers will absorb all or even the majority of the networking and storage addressable markets; the problem is that they will absorb enough to negatively impact the scale at which traditional suppliers can operate.

Those that would compete with Amazon, Google, Microsoft et al, meanwhile, or even HP or IBM’s offerings in the space, will find themselves faced with increasingly higher costs relative to larger competition, whether it’s from premiums paid to various hardware suppliers, lower relative purchasing power or both. Which implies several things. First, that such businesses must differentiate themselves quickly and clearly, offering something larger, more cost-competitive players are either unable or unwilling to. Second, that their addressable market as a result of this specialization will be a fraction of the overall opportunity. And third, that the pool of competitors for base level cloud platform services will be relatively small.

What the long term future holds should these predictions hold up and the market come to be dominated by a few larger players is less clear, because as ever in this industry, their disruptors are probably already making plans in a garage somewhere.

Disclosure: Amazon, Dell, HP, IBM and Microsoft are RedMonk clients. Facebook and Google are not.

Categories: Cloud.

The Implications of IaaS Pricing Patterns and Trends

With Amazon’s re:Invent conference a week behind us and any potential price cuts or responses presumably implemented by this point, it’s time to revisit the question of infrastructure as a service pricing. Given what’s at stake in the cloud market, competition amongst providers continues to be fierce, driving costs for customers ever lower in what some observers have negatively characterized as a race to the bottom.

While the downward pricing pressure is welcome, however, it can be difficult to properly assess how competitive individual providers are with one another, all the more so because their non-standardized packaging makes it effectively impossible to compare service to service on an equal footing.

To this end we offer the following deconstruction of IaaS cloud pricing models. As a reminder, this analysis is intended not as a literal expression of cost per service; this is not, in other words, an attempt to estimate the actual component costs for compute, disk, and memory per provider. Such numbers would be speculative and unreliable, relying as they would on non-public information, but also of limited utility for users. Instead, this analysis compares base hourly instance costs against the individual service offerings. What this attempts to highlight is how providers may be differentiating from each other – deliberately or otherwise – by offering more memory per dollar spent, as one example. In other words, it’s an attempt to answer the question: for a given hourly cost, who’s offering the most compute, disk or memory?

As with previous iterations, a link to the aggregated dataset is provided below, both for fact checking and to enable others to perform their own analyses, expand the scope of surveyed providers or both.

Before we continue, a few notes.


  • No special pricing programs (beta, etc)
  • Linux operating system, no OS premium
  • Charts are based on price per hour costs (i.e. no reserved instances)
  • Standard packages only considered (i.e. no high memory, etc)
  • Where not otherwise specified, the number of virtual cores is assumed to be equal to the available compute units

Objections & Responses

  • “This isn’t an apples to apples comparison”: This is true. The providers do not make that possible.
  • “These are list prices – many customers don’t pay list prices”: This is also true. Many customers do, however. But in general, take this for what it’s worth as an evaluation of posted list prices.
  • “This does not take bandwidth and other costs into account”: Correct, this analysis is server only – no bandwidth or storage costs are included. Those will be examined in a future update.
  • “This survey doesn’t include [provider X]”: The link to the dataset is below. You are encouraged to fork it.

Other Notes

  • HP’s 4XL (60 cores) and 8XL (103 cores) instances were omitted from this survey intentionally, for being twice as large and more than three times as large, respectively, as the next largest instances. While we can’t compare apples to apples, those instances were considered outliers in this sample. Feel free to add them back and re-run using the dataset below.
  • While we’ve had numerous requests to add providers, and will undoubtedly add some in future, the original dataset – with the above exception – has been maintained for the sake of comparison.

How to Read the Charts

  • There was some confusion last time concerning the charts and how they should be read. The simplest explanation is that the steeper the slope, the better the pricing from a user perspective. The more quickly cores, disk and memory are added relative to cost, the less a user has to pay for a given asset.
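
The “steeper the slope, the better” reading can be made concrete with an ordinary least-squares fit of resource size against hourly price. A minimal sketch – the instance packages below are hypothetical, not any surveyed provider’s actual pricing:

```python
# Least-squares slope of resource size (e.g. GB of memory) against
# hourly price: the slope is "units gained per dollar per hour", so a
# steeper slope means better pricing from the user's perspective.
def slope(prices, sizes):
    """Ordinary least-squares slope of size vs. price."""
    n = len(prices)
    mean_p = sum(prices) / n
    mean_s = sum(sizes) / n
    cov = sum((p - mean_p) * (s - mean_s) for p, s in zip(prices, sizes))
    var = sum((p - mean_p) ** 2 for p in prices)
    return cov / var

# Hypothetical provider: four instance sizes, perfectly linear pricing.
prices = [0.05, 0.10, 0.20, 0.40]  # $/hour
memory = [2, 4, 8, 16]             # GB
print(slope(prices, memory))  # 40 GB per $/hr: the steeper, the cheaper
```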

With that, here is the chart depicting the cost of disk space relative to the price per hour.


This chart is notable primarily for two trends: first, the aggressive top line Amazon result and second, the Joyent outperformance. The latter is an understandable pricing decision: given Joyent’s recent market focus on data related workloads and tooling, e.g. the recently open sourced Manta, Joyent’s discounting of storage costs is logical. Amazon’s divergent pattern here can be understood as two separate product lines. The upper points represent traditional disk based storage (m1), which Amazon prices aggressively relative to the market, while the bottom line represents its m3 or SSD based product line, which is more costly – although still less pricy than alternative packages from IBM and Microsoft. Google does not list storage in its base pricing and is thus omitted here.

The above notwithstanding, a look at the storage costs on a per provider basis would indicate that for many if not most providers, storage is not a primary focus, at least from a differentiation standpoint.


As has historically been the case, the correlation between providers in the context of memory per dollar is high. Google and Digital Ocean are most aggressive with their memory pricing, offering slightly more memory per dollar spent than Amazon. Joyent follows closely after Amazon, and then come Microsoft, HP and IBM in varying order.

Interestingly, when asked at the Google Cloud Live Platform event whether the company had deliberately turned the dial in favor of cheaper memory pricing for their offerings as a means of differentiation and developer recruitment, the answer was no. According to Google, any specific or distinct improvements on a per category basis – memory, compute, etc – are arbitrary, as the company seeks to lower the overall cost of their offering based on improved efficiencies, economies of scale and so on rather than deliberately targeting areas developers might prioritize in their own application development process.

Whatever their origin, however, developers looking to maximize their memory footprint per dollar spent may be interested in the above as a guide towards separating services from one another.


In terms of computing units per dollar, Google has made progress since the last iteration of this analysis, where it was a bottom third performer. Today, the company enjoys a narrow lead over Amazon, followed closely by HP and Digital Ocean. IBM, Joyent and Microsoft, meanwhile, round out the offerings here.

It is interesting to note the wider distribution within computing units versus memory, as one example. Where there is comparatively minimal separation between providers with regard to memory per dollar, there are relatively substantive deltas between providers in terms of computing power per package. It isn’t clear that this has any material impact on selection or buying preferences at present, but for compute intensive workloads in particular it is at least worth investigating.

IaaS Price History and Implications

Besides taking apart the base infrastructure pricing on a component basis, one common area of inquiry is how provider prices have changed over time. It is enormously difficult to capture changes across services on a comparative basis over time, for many of the reasons mentioned above.

That being said, as many have inquired on the subject, below is a rough depiction of the pricing trends on a provider by provider basis. In addition to the caveats at the top of this piece, it is necessary to note that the below chart attempts to track only services that have been offered from the initial snapshot moving forward so as to be as consistent as possible. Larger instances recently introduced are not included, therefore, and other recent additions such as Amazon’s m3 SSD-backed package are likewise omitted.

Just as importantly, services cannot be reasonably compared to one another here because their available packages and the attached pricing vary widely; some services included more performant, higher cost offerings initially, and others did not. Comparing the average prices of one to another, therefore, is a futile exercise.

The point of the following chart is instead to try and understand price changes on a per provider basis over time. Nothing more, and nothing less.


Unsurprisingly, the overall trajectory for nearly all providers is down. And the exception – Microsoft – appears to spike only because its base offerings today are far more robust than their historical equivalents. The average price drop for the base level services included in this survey from the initial 2012 snapshot to today was 95%: what might have cost $0.35 – $0.70 an hour in 2012 is more likely to cost $0.10 – $0.30 today. Which raises many questions, the most common of which is to what degree the above general trend is sustainable: is this a race to the bottom, or are we nearing a pricing floor?
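
The per-provider trajectory in the chart is simply percent change from the first snapshot to the latest. A minimal sketch of that calculation, with hypothetical hourly rates rather than any specific provider’s history:

```python
# Percent price change from the first snapshot to the latest, per
# provider. The provider names and hourly rates below are hypothetical
# examples, not figures from the surveyed dataset.
def pct_change(first, latest):
    """Percent change from first to latest price (negative = cheaper)."""
    return (latest - first) / first * 100

history = {"provider-a": (0.40, 0.08),   # $/hour: 2012 snapshot vs. today
           "provider-b": (0.60, 0.15)}
for name, (first, latest) in history.items():
    print(f"{name}: {pct_change(first, latest):+.0f}%")
```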

While we are far from having a definitive answer on the subject, early signs point to the latter. In the week preceding Amazon’s re:Invent, Google announced across the board price cuts to varying services, on top of an October 10% price cut. A week later, the fact that Amazon did not feel compelled to respond was the subject of much conversation.

One interpretation of this lack of urgency is that it’s simply a function of Amazon’s dominant role in the market. And to be sure, Amazon is in its own class from an adoption standpoint. The company’s frantic pace of releases, however – 280 in 2013, on pace for 500 this year – suggests a longer term play. The above charts describe pricing trends in one of the most basic elements of cloud infrastructure: compute. They suggest that at present, Amazon is content to be competitive – but is not intent on being the lowest cost supplier.

By keeping pricing low enough to prevent it from being a real impediment to adoption, while growing its service portfolio at a rapid pace, Amazon is able to get customers in the door with minimal friction and upsell them on services that are both much less price sensitive than base infrastructure as well as being stickier. In other words, instead of a race to the bottom, the points of price differentiation articulated by the above charts may be less relevant over time, as costs approach true commodity levels – a de facto floor – and customer attention begins to turn to time savings (higher end services) over capital savings (low prices) as a means of cost reduction.

If this hypothesis is correct, Amazon’s price per category should fall back towards the middle ground over time. If Amazon keeps pace, however, it may very well be a race to the bottom. Either way, it should show up in the charts here.

Disclosure: Amazon, HP, IBM, Microsoft and Rackspace are RedMonk customers. Digital Ocean, Google and Joyent are not.

Link: Here is a link to the dataset used in the above analysis.

Categories: Cloud.

What are the Most Popular Open Source Licenses Today?

For a variety of reasons, not least of which is that fewer people seem to care anymore, it’s been some time since we looked at the popularity of open source licenses. Once one of the more common inquiries we fielded, questions about the relative merits or distribution of licenses have faded as we see both consolidation around choices and increased understanding of the practical implications of various licensing styles. Given the recent affinity for permissive licensing amongst major open source projects such as Cloud Foundry, Docker, Hadoop, Node.js or OpenStack, however, it’s worth revisiting the question of license choices.

Before we get into the question of how licensing choices have changed, it’s necessary to establish a baseline number for distribution today. While it cannot be considered definitive, Black Duck’s visibility into a wide variety of open source repositories and forges serves as a useful sample size. Based on the Black Duck data, then, the following chart depicts the distribution of usage amongst the ten most popular open source licenses.

(click to embiggen)

Moving left to right, from less popular licenses to the most popular, it is easy to determine the overall winner. As has historically been the case, the free software, copyleft GPLv2 is the most popular license choice according to Black Duck. Besides high profile projects such as Linux or MySQL, the GPL has been overwhelmingly the most selected license for years. The last time we examined the Black Duck data in 2012, in fact, the GPL was more popular than the MIT, Artistic, BSD, Apache, MPL and EPL licenses put together.

Popular as the GPL remains, however, it no longer enjoys that kind of advantage. If we group both versions (2 and 3) of the GPL together, the GPL is in use within 37% of the Black Duck surveyed projects. The three primary permissive license choices (Apache/BSD/MIT), on the other hand, collectively are employed by 42%. They represent, in fact, three of the five most popular licenses in use today.
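The grouping arithmetic behind those family totals can be sketched in a few lines of Python. The individual license shares below are illustrative figures only, chosen to sum to the family totals cited in the text; they are not the actual Black Duck per-license numbers:

```python
# Illustrative per-license shares (percent of surveyed projects).
# These individual values are invented; only the family totals
# (~37% GPL, ~42% Apache/BSD/MIT) come from the discussion above.
shares = {
    "GPLv2": 25, "GPLv3": 12,           # copyleft family
    "Apache": 16, "MIT": 19, "BSD": 7,  # permissive family
    "LGPLv2.1": 6, "MPL": 2,            # file-based middle ground
}

copyleft = sum(shares[name] for name in ("GPLv2", "GPLv3"))
permissive = sum(shares[name] for name in ("Apache", "MIT", "BSD"))

print(f"GPL family:     {copyleft}%")    # 37%
print(f"Apache/BSD/MIT: {permissive}%")  # 42%
```

The point of the grouping is simply that family-level comparisons, rather than individual license rankings, are what reveal the copyleft-to-permissive shift.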

License selection has clearly changed, then, but by how much? For comparison’s sake, here’s a chart of the percent change in license usage from this month’s snapshot of Black Duck’s data versus one from 2009.

(click to embiggen)

As we can see, the biggest loser in terms of share was the GPLv2 and, to a lesser extent, the LGPLv2.1. The decline in usage of the GPLv2 can to some degree be attributed to copyleft license fans choosing instead the updated GPLv3; that license, released in 2007, gained about 6% share from 2009 to 2014. But with usage of version 2 down by about 24%, the update is clearly not the only reason for decreased usage of the GPL.
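The percent-change comparison between the two snapshots is straightforward to reproduce. The snapshot values below are hypothetical, picked only so that the computed changes echo the roughly -24% (GPLv2) and +6% (GPLv3) figures discussed above:

```python
def pct_change(old, new):
    """Percent change in a license's share between two snapshots."""
    return (new - old) / old * 100.0

# Hypothetical share percentages for the 2009 and 2014 snapshots;
# invented values, chosen to mirror the changes cited in the text.
snapshot_2009 = {"GPLv2": 33.0, "GPLv3": 11.3}
snapshot_2014 = {"GPLv2": 25.0, "GPLv3": 12.0}

for license_name in snapshot_2009:
    change = pct_change(snapshot_2009[license_name],
                        snapshot_2014[license_name])
    print(f"{license_name}: {change:+.1f}%")  # GPLv2: -24.2%, GPLv3: +6.2%
```

Note that these are relative changes in share, not percentage-point differences; a license falling from 33% to 25% of projects has lost roughly a quarter of its share.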

Instead, the largest single contributing factor to the decline of the GPL’s dominance – it’s worth reiterating, however, that it remains the most popular surveyed license – is the rise of permissive licenses. The two biggest gainers on the above chart, the Apache and MIT licenses, were collectively up 27%. With the BSD license up 1%, the three most popular permissive licenses are up nearly 30% in the aggregate.

While this shift will surprise some, it suggests that much as the high profile of projects like Linux and MySQL once led to wider adoption of reciprocal or copyleft-style licenses, permissively licensed projects like Hadoop are now leaving a sea of similarly licensed projects in their wake.

But the truth is that a correction of some sort was likely inevitable. The heavily skewed distribution towards copyleft licenses was always somewhat unnatural, and therefore less than sustainable over time. What will be interesting to observe moving forward is whether these trends continue, or whether further corrections are in store. Currently, license preferences seem to be accumulating at either end of the licensing spectrum (reciprocal or permissive); the middle ground of file-based licenses such as the LGPL/MPL remains a relatively distant third category in popularity. Will MPL-licensed projects like the recently opened Manta or SmartDataCenter change that, or are they outliers?

Whatever the outcome, it’s clear we should expect greater diversity amongst licensing choices than we’ve seen in the past. The days of having a single dominant license are, for all practical purposes, over.

Disclosure: Black Duck, the source of this data, has been a RedMonk client but is not currently.

Categories: licensing, Open Source.

Model vs Execution

One of the things that we forget today about SaaS is that we tried it before, and it failed. Coined sometime in 1999 if Google is to be believed, the term “Application Service Provider” (ASP) was applied to companies that delivered software and services over a network connection – what we today commonly call SaaS. By and large this market failed to gain significant traction. Accounts differ as to how and when a) SaaS was coined (IT Toolbox claims it was coined in 2005 by John Koenig) and b) replaced ASP as the term of choice, but the fact that ASP could be replaced at all is an indication of its lack of success. While various web based businesses from that period are not only still with us but, in Amazon and Google, among the largest in the world, those attempting to sell software via the web rather than deploying it on premises generally did not survive.

A decade plus later, however, not only has the SaaS model survived, but it is increasingly the default approach. The point here isn’t to examine the mechanics of the SaaS business; we’ve done that previously (see here or here, for example). The point of bringing up SaaS here, rather, is to serve as a reminder that there’s a difference between model and execution.

Too often in this industry, we look upon a market failure as a permanent indictment of potential. If it didn’t work once, it will never work.

The list of technologies that have been dismissed because they initially failed or seemed unimpressive is long: virtualization was widely regarded as a toy; it’s now an enterprise standard. Smart people once looked at containers and said “neato, why would you want to do that?” Two plus years after Amazon’s creation of the cloud market, then Microsoft CTO Ray Ozzie admitted that cloud “isn’t being taken seriously right now by anybody except Amazon.” In the wake of the anemic adoption – particularly relative to Amazon’s IaaS alternative – of the first iterations of PaaS pioneers such as Google App Engine, many venture capitalists decided that PaaS was a model with no future. DVCS tools like Git were initially scorned and reviled by developers because they were different on a fundamental level.

In each case, it’s important to separate the model from the execution. Too often, failures of the latter are perceived as a fatal flaw in the former. In the case of PaaS, for example, it’s become obvious that the lack of developer adoption was driven by the initial constraints of the first platforms; not having to worry about scaling was an attractive feature, but not worth the sacrifice of having to develop an application in a proprietary language against a proprietary backend that ensured the application could never be easily ported. Half a dozen years later, PaaS platforms are now not only commonly multi-runtime but open source, and growth is strong.

SaaS, meanwhile, would prove to be an excellent model over time, but initially had to contend with headwinds consisting of inconsistent and asymmetrically available broadband, far more functionally limited browser technologies and a user population both averse to risk and brought up on the literal opposite model. In retrospect, it’s no surprise that the ASP market failed; indeed, it’s almost more surprising that the SaaS market followed so quickly on its heels.

In both cases, the initial failures were not attributable to the models. There is in fact demand for PaaS and SaaS, it was simply that the vendors did not (PaaS) or could not (SaaS) execute properly at the time.

Given the rate and pace of change in the technology industry, it is both necessary and inevitable that new technology and approaches are viewed skeptically. As with most innovation, in the technology world or outside of it, failure is the norm. But critical views notwithstanding, it’s important to try and understand the wider context when evaluating the relative merits of competing models. It may well be that the model itself is simply unworkable. But in an industry where what is old is new again, daily, it is far more likely that a current lack of success is due to a failure to execute, or an inability to execute given prevailing market factors.

In which case, you may want to give that “failed” market a second look. Opportunity may lie within.

Categories: Business Models.

A Few Suggestions for Briefing Analysts

One of the things that happens when you’re a developer-focused analyst firm these days is that you talk to a lot of companies. The conversations analysts have with commercial vendors or developers about their projects are called briefings.

Whether the company or project is large or small, old or new, there are always ways to use our collective time – meaning the analyst’s and the company/developer’s – more efficiently and effectively. Having been doing this analyst thing for a little while now, I have a few ideas on what some of those ways might be and thought I’d share them. For anyone briefing an analyst then, I offer the following hopefully helpful suggestions. Best case they’ll make better use of your time, worst case you make the analyst’s life marginally easier, which probably can’t hurt.

  1. Determine how much time you have up front
    This will tend to vary by analyst firm, and sometimes by analyst. At RedMonk, for example, we limit briefings with non-clients to a half hour, a) because we have to talk to a lot of people and b) because very few people have a problem getting us up to speed in that time. It’s important, however, to be aware of this up front. If you think you have an hour, but only have half that, you might present the materials differently.
  2. Unless you’re solving a unique problem, don’t spend your time covering the problem
    If the analyst you’re speaking with is capable, they already understand it well, so time describing it is effectively wasted time. If there’s some aspect of a given market that you perceive differently and break with the conventional wisdom, by all means explain your unique vision of the world (and expect pushback). But a lot of presentations, possibly because they originated as material for non-analysts, spend time describing a market that everyone on the call likely already understands. Jumping right to how you are different, then, is more productive.
  3. If you’re just delivering slides and they’re not confidential (see #4), do not use web meeting software
    If you need to demo an application, web meeting software is acceptable. If you’re just going over slides that aren’t confidential, skip it. Inevitably the meeting software won’t work for someone: they don’t have the right plugin, a dependency is missing, their connection is poor, etc. The downtime while everyone else waits for the one meeting participant to download yet another version of web meeting software they probably don’t want is time that everyone else loses and can never get back. Also, it’s nice for analysts to have slides to refer to later.
  4. Don’t have confidential slides
    If you’re actively engaging with an analyst in something material, a potential acquisition for example, confidential slides are pretty much unavoidable. But if you’re doing a simple product briefing, lose the confidential slides. It makes it more difficult to recall later – particularly if a copy of the slides is not made available – what precisely is confidential, and what is not. Which means that analysts may be reluctant to discuss news or information you’d like them to, due to the cognitive overhead of having to remember which 5 slides out of 40 were confidential. When it comes time to present confidential material, just note that and walk through it verbally.
  5. If you spend the entire time talking, you may miss out on the opportunity for questions later
    It’s natural to want to talk about your product, and the best briefings are conducted by people with good energy and enthusiasm for what they do. That being said, making sure you leave time for questions can gain you valuable insights into what part of your presentation isn’t clear, and – depending on the analyst/firm – may lead to a two way conversation where you can get some of your own questions answered.
  6. Don’t use the phrase “market leader,” let the market tell us that
    This is perhaps just a pet peeve of mine, but my eyebrows definitely go up when vendors claim to be the “market leader.” This is for a few reasons. First, because genuine market leaders should not have to remind you of that. Second, what is the metric? Analysts may not agree with your particular yardstick. Third, because your rankings may not reflect an analyst’s view of the market, and while disagreement is normal it can sidetrack more productive conversations.
  7. Analysts aren’t press, so treating them that way is a mistake
    While frequently categorized together, analysts and press are in reality very different. Attitudes towards and incentives regarding embargoes, for example, are entirely distinct. Likewise, many vendors and their PR teams send out “story ideas” to analysts, which is pointless because analysts don’t produce “stories” and are rarely on deadline in the way that the press is. What we tell clients all the time is that our job is not to break news or produce “scoops,” it’s to understand the market. If you treat analysts as press that is trying to extract information from you for that purpose, you may miss the opportunity to have a deeper, occasionally confidential, dialogue with an analyst.
  8. Make sure the analyst covers your space; if you don’t know, just ask
    Every analyst, whether generalist or specialist, will have some area of focus. Before you spend your time and theirs describing your product or service, it’s important to determine whether or not they cover your space at all. Every so often, for example, vendor outreach professionals will see that we cover “developers” and try to schedule a briefing for their bodyshop offering development services. Given that we don’t generally cover professional services, this isn’t a good use of anybody’s time. The simplest way of determining whether they cover your category, of course – assuming you can’t determine this from their website, Twitter bio, prior coverage, etc – is to just ask.
  9. Asking for feedback “after the call”
    In general, it seems like a harmless request to make at the end of a productive call: “If you think of any other feedback for us after the call, feel free to send this along.” And in most cases, it is relatively innocuous. Another way of interpreting this request, however, is: “Feel free to spend cycles thinking about us and send along free feedback after we’re done.” So you might consider using this request sparingly.
  10. Don’t ask if we want to receive information: that’s just another email thread
    There are very few people today who don’t already receive more email than they want or can handle. To make everyone’s lives simpler, then, it’s best to skip emails that take the form “Hi – We have an important announcement. Would you like to receive our press release concerning this announcement? If so send us an email indicating that you’ll respect the embargo.” As most analysts will respect embargoes – because we’re not press (see #7) – asking an analyst to reply to an email to get yet another email in return is a waste of an email thread. Your best bet is to maintain a list of trusted contacts, and simply distribute the material to them directly.

Those are just a few that occur off the top of my head based on our day to day work. Do these make sense? Are there other questions, or suggestions from folks in the industry?

Categories: Industry Analysis.

The 2014 Monktoberfest

Last Thursday at ten in the morning, this auditorium was full because I made a joke four years ago.

Describing the Monktoberfest to someone who has never been is difficult. Should we focus on the content, where we prioritize talks about social and tech that don’t have a home at other shows but make you think? Or the logistics, where we try to build a conference that loses the things we don’t enjoy from other conferences? Or maybe the most important thing is the hallway track, which is another way of saying the people?

Whatever else it may be, the Monktoberfest is different. It’s different talks, in a different city, given and discussed by different people. Some of those people are developers with a year or two of experience. Others are founders and CEOs. People helping to decide the future of the internet. Those in business school to help build the businesses that will be run on top of it. Startups meeting with incumbents, cats and dogs living together.

Which is, hopefully, what makes it as fun as it is professionally useful. It doesn’t hurt, of course, that the conference’s “second track” – my thanks to Alex King for the analogy – is craft beer.

During the day our attendees are asked to wrap their minds around complicated, nuanced and occasionally controversial issues. What are the social implications and ethics of running services at scale? When you cut through the hype, what does IoT mean for our lives and the way we play? Perhaps most importantly, how is our industry actually performing with respect to gender issues and diversity? And what can we, or what must we, do to improve that?

To assist with these deliberations, and to simultaneously expand horizons on what craft beer means, we turn loose two of the best beer people in the world, Leigh and Ryan Travers, who run Stillwater Artisanal Ales’ flagship gastropub, Of Love and Regret, down in Baltimore. Whether we’re serving them the Double IPA that Beergraphs ranks as the best beer in the world, canned fresh three days before, or a 2010 Italian sour that was one of 60 bottles ever produced, we’re trying to deliver a fundamentally different and unique experience.

As always, we are not the ones to judge whether we succeeded in that endeavor, but the reactions were both humbling and gratifying.

Out of all of those reactions, however, it is ones like this that really get to us.

The fact that many of you will spend your vacation time and your own money to be with us for the Monktoberfest is, quite frankly, incredible. But it just speaks to the commitment that attendees have to make the event what it is. How many conference organizers, for example, are inundated with offers of help – even if it’s moving boxes – ahead of the show? How many are complimented by the catering staff, every year, that our group is one of the nicest and most friendly they have ever dealt with? How many have attendees that moved other scheduled events specifically so that they could attend the Monktoberfest?

This is our reality.

And as we say over and over, it is what makes all the blood, sweat and tears – and as any event organizer knows, there are always a lot of all three – worth it.

The Credit

Those of you who were at dinner will have heard me say this already, but the majority of the credit for the Monktoberfest belongs elsewhere. My sincere thanks and appreciation to the following parties.

  • Our Sponsors: Without them, there is no Monktoberfest
    • IBM: Once again, IBM stepped up to be the lead sponsor for the Monktoberfest. While it has been over a hundred years since the company was a startup, it has seen the value of what we have collectively created in the Monktoberfest and provided the financial support necessary to make the show happen.
    • Red Hat: As the world’s largest pure play open source company, there are few who appreciate the power of the developer better than Red Hat. Their support as an Abbot Sponsor – the fourth year in a row they’ve sponsored the conference, if I’m not mistaken – helps us make the show possible.
    • Metacloud: Though it is now part of Cisco, Metacloud stood alongside of Red Hat to be an Abbot sponsor and gave us the ability to pull out all the stops – as we are wont to do.
    • EMC: When we post the session videos online in a few weeks, it is EMC that you will have to thank.
    • Mandrill: Did you enjoy the Damariscotta river oysters, the sushi from Miyake, the falafel and sliders bar, or the mac and cheese station? Take a minute to thank the good folks from Mandrill.
    • Atlassian: Whenever you’re enjoying your shiny new Hydro-Flask 40 oz growler – whether it’s filled with a cold beverage or hot cocoa – give a nod to Atlassian, who helped make them possible. Outside certainly approves of the choice.
    • Apprenda / HP: From the burrito spread to the Oxbow-infused black bean soup, Apprenda and HP are responsible for your lunch.
    • WePay: Like your fine new Teku stemmed tulip glassware? Thank WePay.
    • AWS/BigPanda/CohesiveFT/HP: Maybe you liked the ginger cider, maybe it was the exceedingly rare Italian sour, or maybe still it was the Swiss stout? These are the people that brought it to you.
    • Cashstar: Liked the Union Bagels on Thursday or the breakfast burritos? That was Cashstar’s doing.
    • O’Reilly: Lastly, we’d like to thank the good folks from O’Reilly for being our media partner yet again and bringing you free books.
  • Our Speakers: Every year I have run the Monktoberfest I have been blown away by the quality of our speakers, a reflection of their abilities and the effort they put into crafting their talks. At some point you’d think I’d learn to expect it, but in the meantime I cannot thank them enough. Next to the people, the talks are the single most defining characteristic of the conference, and the quality of the people who are willing to travel to this show and speak for us is humbling.
  • Ryan and Leigh: Those of you who have been to the Monktoberfest previously have likely come to know Ryan and Leigh, but for everyone else they really are one of the best craft beer teams not just in this country, but the world. And they’re even better people, having spent the better part of the last few months sourcing exceptionally hard to find beers for us. It is an honor to have them at the event, and we appreciate that they take time off from running the fantastic Of Love & Regret to be with us.
  • Lurie Palino: Lurie and her catering crew have done an amazing job for us every year, but this year was the most challenging yet due to some late breaking changes in the weeks before the event. As she does every year, however, she was able to roll with the punches and deliver on an amazing event yet again. With no small assist from her husband, who caught the lobsters, and her incredibly hard working crew at Seacoast Catering.
  • Kate (AKA My Wife): Besides spending virtually all of her non-existent free time over the past few months coordinating caterers, venues and overseeing all of the conference logistics, Kate was responsible for all of the good ideas you’ve enjoyed, whether it was the masseuses two years ago, the cruise last year or the inspired choice of venue this year. And she gave an amazing talk on the facts and data behind sexual harassment. I cannot thank her enough.
  • The Staff: Juliane did yeoman’s work organizing many aspects of the conference, including the cruise, and with James secured and managed our sponsors. Marcia handled all of the back end logistics as she does so well – and put up with the enormous growler boxes living at her house for a week. Kim not only worked both days of the show, but traveled down to Baltimore and back by car simply to get things that we couldn’t get anywhere else. Celeste, Cameron, Rachel, Gretchen, Sheila and the rest of the team handled the chaos that is the event itself with ease. We’ve got an incredible team that worked exceptionally hard.
  • Our Brewers: We picked a tough week for brewer appearances this year, as we overlapped with no fewer than three major beer festivals, but The Alchemist was fantastic as always about making sure that our attendees got some of the sweet nectar that is Heady Topper, and Mike Guarracino of Allagash was a huge hit attending both our opening cruise and Thursday dinner. Oxbow Brewing, meanwhile, not only connected us with a few hard to get selections, but loaned us some of the equipment we needed to have everything on tap. Thanks to all involved.
  • Erik Dasque: As anyone who attended dinner is aware, Erik was our drone pilot for the evening. He was gracious enough to get his Phantom up into the air to capture aerial shots of the Audubon facility as well as video of our arriving attendees. Wait till you see his video. In the meantime, here’s a picture.

With that, this year’s Monktoberfest is a wrap. On behalf of myself, everyone who worked on the event, and RedMonk, I thank you for being a part of what we hope is a unique event on your schedule. We’ll get the video up as quickly as we can so you can share your favorite talks elsewhere.

For everyone who was with us, I owe you my sincere thanks. You are why we do this, and you are the Monktoberfest. Stay tuned for details about next year, as we’ve got some special things planned for our 5th anniversary, and in the meantime you might be interested in Thingmonk or the Monki Gras, RedMonk’s other two conferences, as well as the upcoming IoT at Scale conference we’re running with SAP in a few weeks.

Categories: Conferences & Shows.

A Swing of the Pendulum: Are Fragmentation’s Days Numbered?

Foucault's Pendulum

One of the lessons that has stayed with me all these years removed from my History major is the pendulum theory. In short, it asserts that history typically moves within a pendulum’s arc: first swinging in one direction, then returning towards the other. I’ve been thinking about this quite a bit in recent months as the predictable result of widespread developer empowerment becomes more and more visible in virtually all of the metrics we track. Unsurprisingly, when you have two populations making decisions, the larger one leads to a wider array of outcomes. CIOs, as an example, were long content to consolidate on a limited number of runtimes – Java, .NET and a few others. All of the data we see, however, suggests that as the New Kingmakers have begun to rise up and act on their own initiative, the distribution of runtimes employed has exploded. The pendulum, quite obviously, had swung from centralized to fragmented, driven by a fundamental shift in the way that technologies were selected.
The question I’ve been pondering is simple: when does it begin to swing back in the other direction?

If there is any reversal here, it will come from developers. Even the large, CIO-centric incumbents are aware today that developers are in charge, so there’s no evidence to suggest that CIOs have a plausible strategy for putting developers back under their thumb. But while over the last few years newly empowered developers have shown an insatiable appetite for new technologies, it hasn’t been clear that this trajectory was sustainable longer term.

Which is why I’ve been paying attention, looking for evidence that the pendulum swing might be slowing – even reversing. The data is inconclusive. As Donnie has noted, there have only been five languages that really mattered on a volume basis on GitHub: JavaScript, Ruby, Java, PHP, and Python. And yet our rankings indicate that while they do indeed represent the fat part of the tail, there is substantial, ongoing volume usage of maybe twenty to thirty languages on top of that.

What the data won’t say, however, developers themselves will. Witness this piece from Tim Bray:

There is a real cost to this continuous widening of the base of knowledge a developer has to have to remain relevant. One of today’s buzzwords is “full-stack developer”. Which sounds good, but there’s a little guy in the back of my mind screaming “You mean I have to know Gradle internals and ListView failure modes and NSManagedObject quirks and Ember containers and the Actor model and what interface{} means in Go and Docker support variation in Cloud providers?” Color me suspicious.

Which links to this piece by Ed Finkler:

My tolerance for learning curves grows smaller every day. New technologies, once exciting for the sake of newness, now seem like hassles. I’m less and less tolerant of hokey marketing filled with superlatives. I value stability and clarity.

Which elicited this response from Marco Arment:

I feel the same way, and it’s one of the reasons I’ve lost almost all interest in being a web developer. The client-side app world is much more stable, favoring deep knowledge of infrequent changes over the constant barrage of new, not necessarily better but at least different technologies, libraries, frameworks, techniques, and methodologies that burden professional web development.

Which in turn prompted a response from Matt Gemmell entitled “Confessions of an Ex-Developer”:

I’m glad there are no compilers (visible) in my life. I’m also glad that I can view the WWDC keynote as a tourist, without any approaching tension headache as I think about what I’ll need to add, or change, or remove. I can drift languidly along on the slow-moving current of the everyday web, indulging an old habit when a rainy evening comes by.

It’s a profoundly relaxing thing to be able to observe the technology industry without being invested in it. I’m glad I’m not making software anymore.

To be clear, these are merely four developers. Four experienced developers, more importantly. It may very well be that their experiences are nothing more than a natural and understandable change in priorities that comes with age.

But their experience seems to mirror a logical reaction to a very rapid set of transformations in this industry. Given the hypothesis that the furious rate of change and creation in technology will at some point hit a point of diminishing returns, then become actively counterproductive, it follows that these could merely be the bleeding edge of a more active backlash against complexity. Developers have historically had an insatiable appetite for new technology, but it could be that we’re approaching the too-much-of-a-good-thing stage. In which case, the logical outcome will be a gradual slowing of fragmentation followed by gradual consolidation. Market outcomes would be dependent on individual differences between rates of change, the negative impacts of fragmentation and so on.

It may be difficult to conceive of a return to a simpler environment, but remember that the Cambrian explosion the current rate of innovation is often compared to was itself very brief – in geologic terms, at least. Unnatural rates of change are, by definition, difficult to sustain over time. It is doubtful that we’ll ever see a return to the radically simpler environment created by the early software giants, but it’s likely that we’ll see dramatically fewer popular options per category.

Whether we’re reaching the apex of the swing towards fragmentation is debatable; less so is the fact that the pendulum will eventually swing the other way. It’s not a matter of if, but when.

Categories: Programming Languages.