
What IBM Joining the Cloud Foundry Project Means

When the OpenStack project was launched in 2010, IBM was one of many vendors in the industry offered the opportunity to participate. And though OpenStack launched with a nearly unprecedented list of supporters, IBM was not among them. Despite having no public commitment to an existing open source cloud platform – they had their own service offering in SmartCloud – they declined to join the project.

Until they did two years later.

In 2012, IBM joined along with Red Hat, another industry player that had passed on the initial opportunity to get on the OpenStack train. The original decision and the subsequent about-face may seem contradictory, but it is nothing more or less than the inevitable consequence of how IBM approaches emerging markets.

For many customers, particularly risk averse large enterprises and governments, one of IBM’s primary assets is trust. IBM is in many respects the logical reflection of its customers, who are disinclined – for better and for worse – to reinvent themselves technically as each new wave of technology breaks, as each new “game changing” technology arrives. Instead, IBM adopts a wait and see approach. It was nine years after the Linux kernel was released that IBM determined that the project’s momentum, not to mention the potential strategic impact, made it a worthwhile bet. At which point they promised to inject $1 billion into the ecosystem, a figure that represented a little over 1% of its revenue and fully a fifth of its R&D expenditures that year.

Which is not to compare IBM’s commitment last week to Cloud Foundry to its investment in Linux, in either dollars or significance. As much as one-time VMware chief and current Pivotal head Paul Maritz is seeking to make Cloud Foundry “the 21st-century equivalent of Linux,” even the project’s advocates would be likely to admit there’s a long way to go before such comparisons can be made.

The point is rather that when evaluating the significance of IBM’s decision to publicly back Cloud Foundry, it’s helpful to put their decision making in context. Decisions of this magnitude cannot be made lightly, because IBM cannot return to enterprise customers who have built on top of Cloud Foundry at their recommendation in two years with a mea culpa and a new platform recommendation.

IBM’s support for the Cloud Foundry project signals their belief that the PaaS market will be strategic. Given the aforementioned context, it also means that after an extended period of evaluation, IBM has decided that Cloud Foundry represents the best bet in terms of technology, license and community moving forward. These are the facts, as they say, and they are not in dispute. The primary question to be asked around this announcement, in fact, is less about Cloud Foundry and IBM – we now know how they feel about one another – and more to do with what it portends for the PaaS market more broadly.

A great many in the industry, remember, have written off Platform-as-a-Service for one reason or another. For some VCs it’s the lack of return from various PaaS-related investments; for the odd reporter here or there it’s the lack of traction for early PaaS players like Google App Engine relative to IaaS generally and Amazon specifically. And for developers, it’s frequently the question of whether yet another layer of abstraction needs to be added to virtual machine, IaaS fabric, operating system, runtime / server, programming language framework and so on. The developer’s primary complaint used to be the constraints – runtime choice, database options and so on – but these have largely subsided in the wake of what we term third generation PaaS platforms: platforms that, like Cloud Foundry and OpenShift, offer multiple runtimes and other choices.

But while it’s difficult to predict the future of PaaS, particularly the rate of uptake – certainly it hasn’t gone mainstream as quickly as anticipated here – the history of the industry may offer some guidance. For as long as we’ve had compute resources, additional layers of abstraction have been added to them. Generally speaking this has been for reasons of accessibility and convenience; it’s easier to code in Ruby, as but one example, than Assembler. But some abstractions, middleware in particular, have long served business needs by offering greater portability between application environments. True, the compatibility was never perfect, and write-once-run-anywhere claims tested the patience of anyone who actually tried them.

Greater layers of abstraction, nevertheless, appear inevitable, at least from a historical perspective. Few would debate that C is a substantially more performant language than JavaScript. Regardless of this advantage, accessibility, convenience and other factors such as Moore’s Law have conspired to advantage the more abstract, interpreted language over the closer-to-the-metal C as demonstrated in this data from Ohloh.

Will PaaS benefit from the long term industry trend towards greater levels of abstraction? Having corrected many of the early mistakes that led to premature dismissals of PaaS, it’s certainly possible. Oddly, however, many of the would-be players in the space remain reluctant to make the obvious comparison, that PaaS is the new middleware. Rather than attempt to boil the ocean by educating and evangelizing the entire set of capabilities PaaS can offer, it would seem that the simplest route to market for vendors would be to articulate PaaS as an application container, one that can be passed from environment to environment with minimal friction. It’s not a dissimilar message from the idea of “virtual appliances” that VMware championed as early as 2006, but it has the virtue of being simpler than packaging up entire specialized operating systems, and is thus more likely to work.

If we assume for the sake of argument, however, that PaaS will continue to make gains with developers and the wider market, the question is what the landscape looks like in the wake of the Cloud Foundry-IBM announcement. It’s obviously early days for the market; IBM-approved or no, Cloud Foundry isn’t yet listed as a LinkedIn skill, and the biggest LinkedIn user group we track had a mere 195 members as of July 15th. But in an early market, the IBM commitment is unquestionably a boost to the project. Open source competitors such as Red Hat’s OpenShift project, closed source vendors like Apprenda, and hosted providers like Engine Yard or GAE will all now be answering questions about Cloud Foundry and IBM, at least in their larger negotiated deals.

As it always does, however, much will come down to execution. Specifically, execution around building what developers want and making it easy for them to get it. All the engineering and partnerships in the world can’t save a project that makes developers’ lives harder, as we’ve already seen with the first wave of PaaS vendors that failed to take over the world as expected. Whether or not Cloud Foundry can do that with the help of IBM and others will depend on who wins the battle for developers, and that’s one that’s far from over.

Disclosure: IBM is a RedMonk customer, as are Apprenda, Red Hat and Pivotal. Google and Engine Yard are not RedMonk customers.

Categories: Cloud, Platforms.

The RedMonk Programming Language Rankings: June 2013

[January 22, 2014: these rankings have been updated here]

A week away from August, below are our programming language ranking numbers from June, which represent our Q3 snapshot. The attentive may have noticed that we never ran numbers for Q2; this is because little changed. Which is not to imply that a great deal changed between Q1 and Q3, please note, but rather than turn this into an annual exercise, snapshots every six months should provide adequate insight into the relevant language developments occurring over a given time period.

For those that are new to this analysis, it is simply a repetition of the technique originally described by Drew Conway and John Myles White in December of 2010. It seeks to correlate two distinct developer communities, GitHub and Stack Overflow, with one another. Since that analysis, they have published a more real-time version of their data for those who want day-to-day insights. In all of the times that this analysis has been performed, the correlation has never been less than .78; this quarter’s correlation is .79.

As always, there are caveats to be aware of.

  • No claims are made here that these rankings are representative of general usage more broadly. They are nothing more or less than an examination of the correlation between two populations we believe to be predictive of future use, hence their value.
  • There are many potential communities that could be surveyed for this analysis. GitHub and Stack Overflow are used here first because of their size and second because of their public exposure of the data necessary for the analysis. We encourage interested parties, however, to perform their own analyses using other sources.
  • All numerical rankings should be taken with a grain of salt. We rank by numbers here strictly for the sake of interest. In general, the numerical ranking is substantially less relevant than the language’s tier or grouping. In many cases, one spot on the list is not distinguishable from the next. The separation between language tiers, however, is representative of substantial differences in relative popularity.
  • In addition, the further down the rankings one goes, the less data available to rank languages by. Beyond the top 20 to 30 languages, depending on the snapshot, the amount of data to assess is minute, and the actual placement of languages becomes less reliable the further down the list one proceeds.

With that, here is the third quarter plot for 2013.


Because of the number of languages now included in the survey and because of the nature of the plot, the above can be difficult to process even when rendered full size. Here then is a simple list of the Top 20 Programming Languages as determined by the above analysis.

  1. Java *
  2. JavaScript *
  3. PHP *
  4. Python *
  5. Ruby *
  6. C# *
  7. C++ *
  8. C *
  9. Objective-C *
  10. Shell *
  11. Perl *
  12. Scala
  13. Assembly
  14. Haskell
  15. ASP
  16. R
  17. CoffeeScript
  18. Groovy
  19. Matlab
  20. Visual Basic

(* denotes a Tier 1 language)

Java advocates are likely to look at the above list and declare victory, but Java is technically tied with JavaScript rather than ahead of it. Still, this is undoubtedly validation for defenders of a language frequently dismissed as dead or dying. Java’s ranking rests on its solid performance in both environments. While JavaScript is the most popular language on Github by a significant margin, it is only the fourth most popular language on Stack Overflow by the measure of tag volume. Java, meanwhile, scored a third place finish on Github and second place on Stack Overflow, leading to its virtual tie with perennial champ JavaScript. Not that this is a surprise; Java has scored a very close second place to JavaScript over the last three snapshots.

Elsewhere, other findings of note.

  • Outside of Java, nothing in the Top 10 has changed since the Q1 snapshot.
  • For the second time in a row, ASP lost ground, declining one spot.
  • For the first time in three periods, R gained a spot.
  • Visual Basic dropped two spots after rising one.
  • Assembly language, interestingly, jumped two spots.
  • After breaking into the Top 20 in our last analysis, Groovy jumped up to #18.
  • After placing 16th the last two periods, ActionScript dropped out of the Top 20 entirely.

Outside of the Top 20, Clojure held steady at 22 and Go at 28, while D dropped 5 spots and Arduino jumped 4.

In general, then, the takeaways from this look at programming language traction and popularity are consistent with earlier findings. Language fragmentation, as evidenced by the sheer number of languages populating the first two tiers, is fully underway. The inevitable result of which is greater language diversity within businesses and other institutions, and the need for vendors to adopt multiple-runtime solutions. More specifically, this analysis indicates a best tool for the job strategy; rather than apply a single language to a wide range of problems, multiple languages are leveraged in an effort to take advantage of specialized capabilities.

Categories: Programming Languages.

Why Software Platforms Should Be More Like Pandora

For many years after the de facto industry standardization on the MP3 format, the primary problem remained music acquisition. There were exceptions, of course: serious Napster addicts, participants in private online file trading or even underemployed office workers who used their company LAN to pool their collective music assets. All of these likely had more music than they knew what to do with. But for the most part, the average listener maintained a modestly sized music catalog; modest enough that millions of buyers could fit the entirety of their music on the entry level first generation iPod, which came with a capacity of 5 GB. Even at smaller, borderline-lossy compression levels – which weren’t worth using – that’s just over a thousand songs.

These days, however, more and more consumers are opting into platforms with theoretically unlimited libraries behind them. From iTunes Radio to Pandora to Google Play’s All Access to Rdio to Spotify, listeners have gone from being limited by the constraints of their individual music collection to having virtually no limits at all. Gone are the days when one needed to purchase a newly released album, or worse, drive to a store to buy it. Instead, more often than not, it’s playable right now – legally, even.

The interesting thing about music lovers getting what they always wanted – frictionless online access to music – was that it created an entirely new set of problems. Analysis paralysis, the paradox of choice, call it what you will: it’s become exponentially harder to choose what to listen to.

Which is why those who would continue to sell music are turning to data to do so. Consider iTunes Genius, for example, introduced in 2008. It essentially compares the composition of your music library, and any ratings you might have applied, to the libraries and ratings of every other Genius user. From the dataset created from the combined libraries, it automatically generates a suggested playlist based on a seed track. While it seems like magic, because curating playlists manually can be tedious, it’s really nothing more than an algorithmic scoring problem on the backend. Pandora takes an even more direct route, because it has real-time visibility into both what you’re listening to as well as metadata about that experience: did you rate it thumbs up or down, did you finish listening to it, did you even listen to it at all, are there other similar bands you wish played in the channel? All of this is then fed right back into the algorithms which do the best they can to pick out music that you, and thousands of other users similar to you, might like.
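As a sketch of what such a scoring backend can amount to – this is an illustration with made-up listeners and track names, not Pandora’s or Apple’s actual algorithm – other users’ votes can be weighted by how much their taste overlaps yours:

```python
# Score candidate tracks for a target listener by weighting every other
# listener's liked tracks by their taste similarity to the target.
def similarity(a, b):
    # Jaccard overlap of two sets of liked tracks.
    return len(a & b) / len(a | b) if a | b else 0.0

likes = {  # listener -> set of liked tracks (invented data)
    "you":   {"track_a", "track_b"},
    "user1": {"track_a", "track_b", "track_c"},
    "user2": {"track_d"},
}

def recommend(target, likes):
    scores = {}
    for user, tracks in likes.items():
        if user == target:
            continue
        sim = similarity(likes[target], tracks)
        for t in tracks - likes[target]:  # only tracks the target hasn't liked
            scores[t] = scores.get(t, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("you", likes))  # → ['track_c', 'track_d']
```

Real services layer far more signal onto this – skips, completions, time of day – but the core remains the same: every user interaction sharpens the dataset the scoring runs against.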

While the approaches of these and other services may differ, what they have in common is simple: a critical mass of listeners who are all voluntarily – whether they know it or not – building an ever larger, and ideally ever smarter, dataset of musical preferences on behalf of the vendor they’re buying from.

This is one of the examples that software companies should be learning from, although that should be “non-music” software companies, since just about every important new music company, including the examples above, is a software company first, music company second. Like the music companies, software companies should increasingly not be focused merely on the asset they wish to sell – software, in most cases – but data they might be in a position to collect that can be used to sell that software. Or as a saleable asset in and of itself.

For example, consider the case of a PaaS platform vendor. While the initial generation of platforms – GAE, etc – were very opinionated in that they dictated runtime, database, schema and so on, the majority of players today offer multiple choices. Database options might include MongoDB, MySQL and Postgres, while runtimes might range from Java to JavaScript to PHP to Python to Ruby.

Many incoming customers, of course, may already know what technologies they prefer; they may even be locked into those choices. But those who haven’t made choices, and even some of those who have, would appreciate more detailed information on usage across the platform. What if, for example, you had real-time or near real-time numbers for the adoption of MongoDB which indicated exploding traction amongst other users of the platform? Or a spike in JavaScript runtime consumption? Even more interesting, how are the databases trending broadly versus for customers of a given size? Every choice a customer makes – to use Java, to deploy a MySQL instance – is the equivalent of a Pandora “Like” signal. But you have to capture these signals.
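Capturing them need not be complicated. A minimal sketch – with invented event data and component names, purely for illustration – of turning per-customer platform choices into per-period adoption trends:

```python
from collections import Counter

# Each platform choice is recorded as a (period, component) event,
# the PaaS equivalent of a Pandora "Like". Sample data is made up.
events = [
    ("2013-Q2", "MongoDB"), ("2013-Q2", "MySQL"),
    ("2013-Q3", "MongoDB"), ("2013-Q3", "MongoDB"), ("2013-Q3", "Postgres"),
]

def adoption_by_period(events):
    # Aggregate raw choice events into per-period component counts.
    trend = {}
    for period, component in events:
        trend.setdefault(period, Counter())[component] += 1
    return trend

trend = adoption_by_period(events)
print(trend["2013-Q3"]["MongoDB"])  # → 2
```

The aggregation itself is trivial; the value lies in instrumenting the platform so the events exist to aggregate, and then exposing the resulting trends back to customers weighing their own choices.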

Like music services, most technology platforms – particularly those that are run in a service context – are generating valuable data that can be used to inform customer choices. To date, however, very few platform providers are even thinking about this data in a systematized fashion let alone exposing it back to their customers in meaningful ways. We know this because we ask about it in every briefing.

Those vendors that embrace a software plus data approach, therefore, are likely to have a competitive advantage over their peers. And importantly, it’s the rare competitive advantage that becomes a larger barrier to entry – a data moat, if you will – over time.

Categories: Platforms.

Open Source Foundations in a Post-GitHub World

Solar Eclipse 2009 (NASA, Hinode, 7/22/09)

Two years ago Mikeal Rogers wrote a controversial piece called “Apache considered harmful” that touched a nerve for advocates of open source software foundations. Specifically, the piece argued that the ASF had outlived its usefulness, but in reality the post-GitHub nature of the criticism applied to a wide range of open source foundations.

For many years, open source foundations such as Apache counted project hosting as one of their core reasons for being. But in the majority of cases, the infrastructure supporting this functionality was antiquated, as few of the foundations had embraced modern Distributed Version Control Systems such as Git. The Eclipse Foundation, for example, had a number of projects controlled by CVS, an application whose first release was in 1990. The ASF, meanwhile, was fully committed to its own Subversion project, a centralized VCS that was over a decade old at the time of Rogers’ post.

Outside the foundations, meanwhile, the traction of GitHub’s implementation of Git had exploded. It had become, almost overnight, the default for new project hosting. And because GitHub was in the business of hosting a version control system, and was paid for it, it was no surprise that the quality of their hosting implementation was substantially better than what open source foundations like Apache or Eclipse could offer.

This preference for GitHub’s implementation led some developers, like Rogers, to question the need for foundations like Apache or Eclipse. In a world where GitHub was where the code lived and the largest population of developers was present, of what use were foundations?

One answer, in my view, was brand. Others included IP management, project governance, legal counsel, event planning, predictable release schedules and so on. But even assuming those services represent genuine value to developers, it would be difficult to adequately offset GitHub’s substantial advantages in interface and critical mass. GitHub makes a developer’s life easier now; intellectual property policies might or might not make their life easier at some point in the future.

As of this morning, however, developers at one foundation no longer need to choose. As the Eclipse Foundation’s FAQ covers, the Eclipse Foundation will now permit projects – just new ones, for the time being – to host their primary repository external to the foundation’s servers, at GitHub.

The move is not without precedent; the OuterCurve (née CodePlex) Foundation has permitted external hosting for several years. But with this announcement Eclipse becomes one of the first large, mature foundations to explicitly fold external properties such as GitHub into its workflow.

This change should benefit everyone involved. Properties like GitHub gain code and developers, foundations can focus on areas they’re likely to add more value than project hosting, and developers get the benefits of a software foundation without having to sacrifice the tooling and community they prefer. For this reason, it seems probable that over time this will become standard practice, particularly as foundations look to stem criticism that they’re part of the problem rather than part of the solution. In the short term, however, there are likely to be some bumps in the road as new school populations within the foundations push their old school counterparts for change. Eclipse will in that respect be an interesting case study to watch.

Either way, while Eclipse may be the first large foundation to adapt itself to the post-GitHub environment, it’s unlikely to be the last.

Disclosure: The Eclipse and OuterCurve Foundations are RedMonk clients.

Categories: Open Source.

The Google Cloud Platform Q&A

While the bulk of the attention at Google I/O last week, at least in terms of keynote airtime, was devoted to improvements to user-facing projects like Android and Chrome, the Cloud team had announcements of their own. Most obviously, the fact that the Google Compute Engine (GCE) had graduated to general availability. Both because it’s Google and because the stakes in the market for cloud services are high, there are many questions being asked concerning Google’s official entrance to the market. To address these, let’s turn to the Q&A.

Q: The first and perhaps most obvious question is why now? Or more critically, why did it take Google so long to bring GCE to GA?
A: The flip answer is to point to how long Gmail was in beta. Google, historically, has had no reluctance to preview their services to a limited audience, a necessary precaution in many cases given their scale. The way one story goes, Google was forced to scramble for mere bandwidth following the release of Google Maps, having substantially underestimated the overwhelming demand for what was, at the time, a revolutionary mapping product. At scale, even simple things become hard. And delivering IaaS services, while a solvable problem, is not simple.

All of that said, Google’s late entrance to this market is also likely to be the product of a strategic misstep. Consider that Google App Engine – the company’s PaaS platform, and one of the first to market – has been available since 2008. It has been abundantly clear in the years since that, while PaaS may yet become a mainstream application deployment model, IaaS is more popular by an order of magnitude or more. Whether it was Google’s belief that PaaS would eventually become the preferred choice over IaaS, or whether Google had questions about their interest or ability to execute effectively in that type of a business, the fact is that they’re seven years late to market.

Q: So is it too late?
A: Surprisingly, the answer is probably not. Google’s delay has certainly created an enormous hill to climb; Amazon has spent the past seven years not only inhaling the market, they’ve actually been able to sustain a remarkable pace of innovation while doing so. Rather than being content with a few core services, Amazon has continued to roll out new capabilities at an accelerating rate. And in a departure from traditional IT supplier practices, they have lowered their prices rather than raised them. Repeatedly.

All of that said, two factors are in Google as well as other would-be Amazon competitors’ favor. First, far more workloads are not running in public clouds today than are. This means that as impressive as the growth in the cloud sector has been, a great deal of oxygen remains. Second, cloud infrastructure is by design more ephemeral than the physical alternatives that preceded it. It’s far more difficult to decommit from thousands of physical machines than cloud instances. While migrations between public clouds, then, are not without complication or risk, they are more plausible than customers swapping out their on premise infrastructure wholesale for a competitor.

So while Google’s delay was costly, it is unlikely to be fatal.

Q: Is Google serious? Or are these cloud services just more Google experiments that will be shut down?
A: It may be natural to ask this question in the wake of the house cleaning Google’s done over the past few years, shuttering a variety of non-core projects. There is no real evidence that this concern is legitimate regarding the Google cloud offerings, however. In App Engine, Google has technically been in market for years, and in that time, they have ramped their involvement up, not down. GAE has expanded its capabilities, multiple datastore options have been launched, GCE has been previewed and then released as a production product.

Google also probably cannot afford to sit this one out. A world in which an increasing number of compute workloads run on infrastructure maintained by competitors like Amazon or Microsoft is a multi-dimensional threat to Google’s business. Besides infusing those businesses with capital that can be used to subsidize efforts to attack Google in areas like mobile, owning customer relationships via cloud sales may allow competitors to cross-sell other services, such as collaboration or even advertising.

For those still not reassured, it’s worth noting that – like Amazon – Google is compelled to maintain large scale infrastructure as part of its core business. While its primary revenue source is obviously advertising, Google is at its core an infrastructure company. Which means that reselling infrastructure is not exactly a major departure from its business model.

Q: So Google’s serious about the cloud market – are they equally serious about the enterprise market?
A: The answer to this depends in part on how you believe cloud is currently being adopted by enterprises. If you’re of the belief that enterprise cloud adoption will resemble that of traditional infrastructure, Google does not currently appear to be, quote unquote, serious about the enterprise market. Certainly they are not at present offering the kind of certification program that Amazon is using to court enterprise buyers. Google’s recent standardization on Debian, in fact, could be construed as an active rejection of enterprise requirements; CentOS, at least, would represent an opportunity to market to current Red Hat customers.

What if, however, you believed that cloud adoption was proceeding not from the top down but rather the bottom up? What if you believed that developers were leading the adoption of cloud services within the enterprises? How might you optimize your offering for developer adoption? Well, you might begin by standardizing on the preferred distribution of developers. Which would be, according to the research of my colleague Donnie Berkholz, none other than Debian-based distros. You might price competitively with the current developers’ choice, Amazon, and go one step further to offer sub-hourly billing. And you’d obviously expose the whole thing via a single JSON API, accessible via a command line tool.

The punchline, of course, is that Google has done all of the above. In a perfect world, you would build cases for both developer and enterprise, as Amazon has done. But playing from behind, Google appears to be betting on the developer rather than pursuing the features that would appeal to traditional enterprise buyers.

If you think developers are playing a deciding role with respect to adoption, then, within the enterprise, you can argue that Google is serious about that market. If you believe that CIOs remain firmly in control, then no, Google is not serious about the enterprise.

Q: What was the most significant cloud-related announcement from I/O?
A: The answer depends on timeframe. In the short term, the addition of PHP support on App Engine dramatically expands that platform’s addressable market. Likewise, the more granular pricing will potentially lower costs while allowing developers the ability to experiment.

Over the longer term, the introduction of the non-relational Google Datastore gives GCE an alternative to Amazon’s Dynamo or SimpleDB, as well as the countless other NoSQL databases saturating the market, and a complement to their existing BigQuery and Cloud SQL (MySQL-as-a-Service). Given the massive popularity of non-relational stores, this announcement may be the most significant over the longer term.

Q: How serious a threat is Google to Amazon’s cloud? Or Microsoft’s, or Rackspace’s for that matter?
A: I argued in my 2013 predictions piece that Google would be the most formidable competitor Amazon has yet faced, and nothing that’s occurred since has caused me to rethink that position.

In the short term, neither Google nor anyone else will challenge Amazon, whose dominance of the cloud is substantially understated, in my opinion, by this 451 Group survey indicating a 19% market share. The Register, meanwhile, points to the disparity in available services. Amazon is to the cloud what Windows was to operating systems and what VMware is to virtualization, and it would be difficult to build the case otherwise.

Over the medium to longer term, however, Google has economies of scale, expertise in both software and infrastructure, and existing relationships with large numbers of developers. More specifically:

[Google] has the advantage of having run infrastructure at a massive scale for over a decade: the search vendor is Intel’s fifth largest customer. It also has deep expertise in relevant software arenas: it has run MySQL for years, the company was built upon customized versions of Linux and it is indirectly responsible for the invention of Hadoop (via the MapReduce and GFS papers).

Google’s a fundamentally different type of competitor to AWS, and there are signs that Amazon recognizes this.

Which is what will make the months ahead interesting to watch.

Disclosure: Amazon, Microsoft and VMware are RedMonk clients, Google is not.

Categories: Cloud.

Why There Was No New Hardware At Google I/O

Google IO 2013

“Google announced so many things yesterday that it makes my head spin.” – Fred Wilson

The challenge with a conference like Google I/O, where the announcements arrive one after another, is to see both forest and trees. Analysis of individual announcements – such as Google’s new Pandora/Rdio/Spotify competitor All Access, or the granular pricing for its compute infrastructure – is relatively straightforward. What’s more important, however, is perceiving the larger pattern.

The most obvious feature of Google I/O is the emphasis on the developer. As they have in years past, Google demonstrated their commitment to developers financially, handing out over a thousand dollars of free hardware in the Chromebook Pixel. But the content itself reflected this prioritization. Rather than easing into the keynote with something accessible to non-programmers such as All Access, Google devoted fully the first forty minutes to API announcements. And then followed that up with the release of a new Android development tool that is already eliciting favorable comparisons to Apple’s Xcode. And so on. I/O is, to its credit, remaining true to its roots – it is a developer show first, and everything else second.

So Google gets the importance of developers: this does not exactly qualify as news.

Perhaps less obvious, however, was the strategy implied by the announcements. Many were surprised – in spite of the hints ahead of the show – that Google did not unveil a new piece of hardware, or even an updated version of their Android operating system. There was even disappointment in some quarters; a few of the developers seated behind me were grumbling that there wasn’t even an update to the Nexus 7, as had been speculated ahead of the show. Nor should the disappointment have been a surprise: Apple has created the expectation that developer events must also serve as launch platforms for hardware and software. Developers have been conditioned to expect new hardware, new operating systems and more.

By announcing neither a new device – unless you count the already available Samsung Galaxy device running stock Android that will be sold in late June – nor a new version of the operating system, Google is telegraphing their belief that the basis of competition lies elsewhere.

Some might argue that this is less of a strategic statement than a matter of timing; that Google simply didn’t have either a new operating system or device to present. And there is truth in that. With its Nexus 4 device less than six months old, an X phone announcement was always unlikely. The past four releases of Android, meanwhile, have arrived in either July or November/December. It is hard to make the argument, however, that Google could not have at least previewed upcoming technologies. Technology companies reveal products far ahead of their production readiness all the time.

The statement made by Google yesterday, instead, is that the war for mobile will not be won with devices or operating systems. It will be won instead with services.

Last November, Patrick Gibson argued that Google was getting better at design faster than Apple was getting better at services. While Google’s design credibility can be debated, Apple’s history in services cannot. Though its systemic issues have been damaging enough to require more than one apology from the company, Apple has – in spite of its resources – seemingly made little progress in the services area. Those who have worked with the company point to cultural issues as one factor – the company’s secrecy can make it difficult for infrastructure teams to work effectively together – but whatever the reason, Apple has been less successful in services than in virtually any other area of its business, at a time when services are becoming more important to users.

Now consider what Google announced at I/O yesterday:

  • Commerce services (instant buy, wallet objects, send money in Gmail)
  • Education services (apps sorted by class / grade level, automated rollout to classes)
  • Collaboration services (cross-platform persistent conversations, video chat)
  • Game services (save to cloud, multiplayer, etc)
  • Map services (activity recognition, geofencing, low power location)
  • Market improvements (in-market translation, automated application recommendations)
  • Music services (All Access, curated playlists)
  • Now services (media recommendations, public transit commute times, reminders)
  • Photo services (automated photo triage, generated motion images, automatic improvements)
  • Search services (hotword triggering)

And that’s without getting into any of the Google Cloud Platform announcements. Apple has competitive offerings – superior offerings, in some cases – to some of the above. But it’s missing many, and in others, Maps most notably, Apple lags considerably behind. Hence Google’s approach, in which it attempts to apply its strengths in delivering services at scale to Apple’s perceived weakness – delivering services at scale.

Whether Google’s strategy here is successful depends in part on timing and the commoditization of the user interface. Essentially Google needs Android to be, at a minimum, a “good enough” user interface to be considered a reasonable alternative for a large enough subset of the addressable market to make Google’s advantages in services relevant. Two or three years ago, this was not the case. Today, it might be. Apple, meanwhile, needs to increase the distance between Android and iOS enough to give itself time to either build or acquire a competency in services.

Either way, two things are clear: the developers, as ever, are firmly in control, and WWDC in June should be very interesting.

Categories: Mobile.

Capsule, The Developer’s Code Journal

Most of the charts and analysis you see in this space are produced, as a few of you know, via R and, more specifically, R Studio. R Studio is an excellent tool that streamlines the process of working with R, and while it’s certainly not necessary to work with the language, I recommend it to those looking for a more comprehensive interface. As much as I appreciate the tool, however, it has never obviated my need for a scratchpad. Like a lot of developers, I often work outside of my chosen development environment, maintaining a separate Sublime Text window to capture snippets of code, notes on what they do and how they do it, and more. And like a lot of developers, I’ve never really thought about this scratchpad or the process behind it.

I do eventually migrate a subset of these snippets from Sublime – how to get ggplot to generate a stacked chart incorporating negative values, for example – into a Google Doc for more permanent storage than just another open tab. But the process is manual, imperfect and not collaborative at all. All of which may help explain why I think Alex King’s new project Capsule is both important and relevant to anyone interested in the craft of software development. As he put it, it’s potentially a solution to a problem that I didn’t know I had.

Described as the “Developer’s Code Journal,” Capsule is a WordPress based replacement for the scratch document that you maintain to capture everything that doesn’t fit well within your development tool (be that a text editor or IDE) of choice: extended comments, outlines, code snippets and so on. Basically, its function is akin to a diary or journal, but one designed and built to cater specifically to the art and task of coding.

As you’d expect, it incorporates a good code editor (the same as GitHub) with language autocomplete and so on. Even better, you can organize these by project (@ notation) or tag (# notation) for easy retrieval later. And as a web based application rather than a local text doc, it’s possible to build a collective scratchpad equivalent, collaboratively, containing the shared thoughts of development teams.
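To make the retrieval model concrete, here is a minimal, hypothetical Python sketch of indexing entries by @project and #tag tokens. The parsing rules are assumptions for illustration only – Capsule itself is a WordPress application, and this is not its actual implementation:

```python
import re

# Hypothetical sketch of Capsule-style organization: journal entries are
# indexed by @project and #tag tokens so snippets can be retrieved later.
# The token grammar here is an assumption, not Capsule's actual code.

def index_entry(text):
    """Extract @project and #tag tokens from a journal entry."""
    projects = set(re.findall(r"@([\w-]+)", text))
    tags = set(re.findall(r"#([\w-]+)", text))
    return projects, tags

entry = "Stacked ggplot chart with negative values @redmonk-charts #r #ggplot"
projects, tags = index_entry(entry)
print(projects)        # {'redmonk-charts'}
print(sorted(tags))    # ['ggplot', 'r']
```

A real implementation would persist these indexes in a database rather than recompute them, but the notation’s appeal – organization as a side effect of simply typing – survives even this toy version.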

If you’re interested in the craft of software, and I’d hope that most of the people reading this fit that description, this is a project I’d recommend looking at. And given that it’s open source – it’s up on GitHub here – potentially even contributing back to.

The only question I’m trying to decide before implementing it is whether to run it locally and back up the database to Dropbox regularly, as Alex does, or remotely, which removes a few complications but means losing access to the data when offline.

Either way, it’s a tool that I expect to be using soon.

Disclosure: Alex is a friend of mine.

Categories: Application Development, Collaboration, Open Source.

What the OpenStack Foundation Needs to Do

Jonathan Bryce, Lauren Sell, Mark Collier

In the wake of last week’s well attended OpenStack Summit, there has been much discussion of the state of the project. As is typical, this ranges from heated criticism of the project’s community, governance or technology to grandiose claims regarding its trajectory and marketplace traction. And as is typical, the truth lies somewhere in between.

Critics of the project suggesting that it has real organizational issues and engineering shortcomings to address are correct. As are proponents arguing that the project’s momentum is accelerating, both via additions to its community and by the lack thereof from competitive projects and products. The former is, in all probability, the more important of the two developments. Engineering quality is important, but as we tell all of our clients, it has become overvalued in many technology industry contexts. With the right resources, quality of implementation is – usually – a solvable problem. The lack of a community, and attendant interest, is much less tractable. More often than not, the largest community wins.

In the case of OpenStack, however, this can be considered a positive for the project only as long as there is one OpenStack community. It is unclear that this will remain the case moving forward.

Historically, some of the most important and highest profile platform technologies – Linux being the most obvious example – have been reciprocally licensed. In practical terms, this requires vendors distributing the codebase to make any modifications to it available under precisely the same terms as the original code. OpenStack, like Cloud Foundry, Hadoop and other younger projects, is permissively licensed. Unlike reciprocally licensed assets, then, distributors of OpenStack technologies are not required to make any of their bugfixes, feature upgrades or otherwise available under the same terms, or indeed available at all.

Though not required by the license, the overwhelming majority of code is contributed back to the project, because there is little commercial incentive to heavily differentiate from OpenStack. There are, however, commercial incentives to differentiate in certain areas. Which could, over the longer term, lead to fragmentation within the OpenStack community.

To combat this, the OpenStack Foundation and its Board of Directors must make two difficult decisions regarding compatibility.

First, it needs to answer a currently existential question regarding OpenStack: specifically, what is it, exactly? What constitutes an OpenStack instance? One interpretation is that an OpenStack instance is one that has implemented Nova and Swift, the compute and object storage components within OpenStack. What of vendors or customers who have found Swift wanting, and turned to Ceph or RiakCS, then, as an alternative? Are they not OpenStack? Further, how might the definition of what constitutes an OpenStack project evolve over time? Over what timeframe, for example, might customers have to implement Quantum (networking), Keystone (identity), and Heat (orchestration) to be considered ‘OpenStack?’

Answering this question will involve difficult decisions for the OpenStack project, because opinions on the answer are likely to vary depending on the nature of existing implementations and the larger strategies they reflect. Because much of OpenStack’s value to customers – and the marketing that underpins it – lies in its avoidance of lock-in, however, answering this question is essential. A customer that cannot move with relative ease from one OpenStack cloud to another because the underlying storage substrates differ is, open source or no, effectively locked in.

The OpenStack Foundation could decline to take an aggressive position on this question, leaving it to the market to determine a solution. This would be a mistake, because as we’ve seen previously in questions of compatibility (e.g. Java), trademark is the most effective weapon for keeping vendors in line. OpenStack implementations that are denied the right to call themselves OpenStack as a result of a breach of interoperability guidelines are effectively dead products, and vendors know it. Given that the Foundation controls the trademark guidelines, then, it is the only institution with the power to address the question of what is OpenStack and what is not.

Assuming that the question of what foundational components are required versus optional in an OpenStack implementation can be answered to the market’s satisfaction, the second cause for concern lies in compatibility between the differing implementations of those foundational components. The nature of implementations, for instance, may introduce unintended, accidental incompatibilities. Consider that shipped distributions are likely to be based on older versions of the components than those hosted, which are frequently within a week or two of trunk. How then can a customer seeking to migrate workloads to and from public and private infrastructure be sure that they will run seamlessly in each environment?

This type of interoperability is by definition more complex, but it is not without historical precedent. As discussed previously in the context of Cloud Foundry, one approach the Foundation may wish to consider is Sun’s TCK (Technology Compatibility Kit) – should a given vendor’s implementation fail to pass a standard set of test harnesses, it would be denied the right to use the trademark. Indeed, this seems to be the direction that Cloud Foundry itself is following in an attempt to forestall questions of implementation compatibility.
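The mechanics of the TCK approach can be sketched simply: a conformance suite runs against a vendor implementation, and any failure forfeits the trademark. The object store interface and checks below are hypothetical illustrations, not an actual OpenStack or Cloud Foundry test harness:

```python
# Hedged sketch of a TCK-style compatibility harness: a vendor implementation
# earns the trademark only if it passes every check in a standard suite.
# Interface and checks are invented for illustration.

class InMemoryObjectStore:
    """Stand-in for a vendor's object storage implementation."""
    def __init__(self):
        self._data = {}
    def put(self, key, value):
        self._data[key] = value
    def get(self, key):
        return self._data.get(key)
    def delete(self, key):
        self._data.pop(key, None)

def run_suite(impl, checks):
    """Run each named check; a False result or an exception counts as a failure."""
    failures = []
    for name, check in checks.items():
        try:
            ok = bool(check(impl))
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return len(failures) == 0, failures

CHECKS = {
    "put_then_get": lambda s: (s.put("k", b"v"), s.get("k") == b"v")[1],
    "delete_then_get_returns_none": lambda s: (s.delete("k"), s.get("k") is None)[1],
}

conformant, failed = run_suite(InMemoryObjectStore(), CHECKS)
print(conformant, failed)  # a conforming store passes: True []
```

The hard part, of course, is not the harness but agreeing on the checks – which behaviors are normative and which are left to implementations – and that is precisely the negotiation the Foundation would have to broker.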

Ultimately, the pride on display at the OpenStack Summit last week was well justified. The project has come a long way since its founding, when several of its current members declined to participate after examining the underlying technologies. But its future, as with any open source project, depends heavily on its community, which in turn is dependent on the Foundation keeping that community from fragmenting. The good news for OpenStack advocates is that there are indications the board understands the importance of these questions, and is working to address them. How effective they are at doing so is likely to be the single most important factor in determining the project’s future.

Disclosure: Multiple vendors involved in the OpenStack project, including Cisco, Cloudscaling, Dell, HP, IBM and Red Hat, are RedMonk customers. VMware, which is both a participant in the OpenStack community and a competitor to it, is a customer.

Categories: Cloud, Open Source.

Academia and Programming Language Preferences

For years now, RedMonk has argued that programming language usage and overall diversity is growing rapidly. With developers increasingly empowered to select the best tool for the job rather than having to content themselves with the one they are given, the fragmentation of runtimes in use has unsurprisingly been heavy. Where enterprises used to be at least superficially built on a small number of approved programming languages, today’s enterprise is far more heterogeneous than in years past, with traditional compiled languages (C/C++) coexisting with managed alternatives (C#/Java) as well as a host of dynamic options (JavaScript, PHP, Python, Ruby).

While this trend is easily observed in a variety of contexts, one question that hasn’t been asked to date is how the academic world is adapting to, coping with or driving this change. What role have colleges and universities played in the proliferation of alternative languages?

To answer this, our own Marcia Chappell researched the published computer science curriculums at the Forbes Top 10 Colleges and Universities. We picked Forbes as opposed to alternatives like US News and World Report because it did not differentiate between size of institution, theoretically providing us with a broader sample.

Unfortunately, the research proved in many cases fruitless. Of the 519 courses with descriptions published online, we were able to collect a mere 93 mentions of particular programming languages or runtimes. Many courses do not include specifics regarding languages taught, either because it may vary by instructor or because the language is viewed as less important than the course material. As a result, it’s impossible to draw any statistically significant conclusions from the data, because sampling it properly is impractical or impossible given the nature of the online course descriptions.

Of those courses that did publish information about the technologies taught, however, here is the distribution.

With the above caveat that this chart cannot be considered actually representative of the curriculum of the institutions in question, nor could those institutions be considered representative of the wider academic community, this chart does prompt questions about the content of today’s academic computer science and related coursework.

  • The academic affinity for traditional languages such as C, C++ and Java is understandable. These have not only been staples of computer science careers for decades, they undoubtedly reflect a traditional academic approach which argues in favor of an education built on principles (see, for example, Joel Spolsky’s answer here or his piece here). The under-representation of dynamic alternatives, however, is noticeable. Python is mentioned less than half as much as C and Java, JavaScript is mentioned a bit more than half as much as Python, and Perl/PHP half as much as JavaScript. Ruby isn’t mentioned at all.
  • The relative infrequency with which R was mentioned generally – and particularly relative to MATLAB – was somewhat surprising. In my experience, more academic statisticians today seem to be favoring R over MATLAB, which this data contradicts.
  • With the exception of Java (Android), mobile appears to be substantially under-represented within this slice of academia. C# (Xamarin) and Objective-C (iOS) reflect a minimal presence. Clearly some programs are catering to increased demand for mobile skills (see the Harvard Extension School’s CompSci E-76) but these efforts appear to be, at this point, rare.

The most interesting question raised by this examination, of course, is not necessarily what role academia plays in language adoption, but rather what role it should play. Is the purpose of college coursework to provide students with the equivalent of a classical education in Greek and Latin, a deep understanding of the foundation of the discipline? Or should universities be more responsive to industry trends and demands from the job market, particularly in a context in which their graduates are facing a highly challenging hiring climate?

Ideally, students might have a choice. Those seeking long term careers as computer scientists and software engineers have the depth of coursework necessary to ground them moving forward, while those seeking the most marketable and in demand skills might pick from courses satisfying that shallower need. In practical terms, however, universities – at least those surveyed here – seem to be far more focused on the former than the latter. The question then becomes whether that approach best serves their students.

Categories: Education.

Roll Your Own Hardware and The Disruption of the Enterprise Server Market


In December of 2004, Adam Bosworth wrote a seminal essay entitled “Where have all the good databases gone?” Anticipating by years the rise of the MapReduce/NoSQL movements, it succinctly identified the central problem: “The products that the database vendors were building had less and less to do with what the customers wanted.”

In the years since, it has become clear that databases are not the only software area challenged in this respect. For the first time since the rise of Microsoft, Oracle and other large software players, businesses of all shapes and sizes are beginning to turn not to vendors for technical solutions, but to their own staff, or the software products of other businesses released as open source software. The software industry, in other words, is in the process of being disrupted by its would-be customers: we’re seeing the return of roll your own.

None of which is particularly surprising. As the constraints of available software and hardware are removed from developers by open source and the public cloud, respectively, their collective output goes up. With a growing subset of this output released as open source software by entities such as Facebook, LinkedIn or Twitter who see software as a means to an end rather than an end in and of itself, it begins to create a virtuous cycle: more high quality freely available software means less reinvention of the wheel, thus begetting more open source software.

What is surprising, however, is the degree to which hardware vendors are proving to be vulnerable to the same trend. This is counterintuitive given the perceived difference between authoring code and manufacturing hardware. Aside from the difference in upfront capital required, there is the fact that manufacturing at any reasonable scale has been considered a complicated exercise, one that ideally would be outsourced to specialty vendors (e.g. Dell, HP, IBM).

One of the industry’s worst kept secrets, however, has been that Google’s servers are not products of these vendors but rather machines it designed itself. While acknowledging this fact, most hardware manufacturers have dismissed its importance as either a quirk of Google’s culture or a problem unique to the only business in the world operating at that scale. In other words, Google’s roll-your-own was the exception, not the rule.

And for the most part, industry analysts tracking the hardware market implicitly validated these claims, because their numbers reflected minimal competition for traditional hardware suppliers. But as discussed in a 2011 Wired article by Bob McMillan, it was clear that the picture painted by the analyst reports was at best incomplete, because it was significantly under-reporting traction from Original Design Manufacturers (ODMs). As Andy Bechtolsheim (who co-founded Sun in 1982) put it,

“It’s hard to get those numbers because they’re not reported to IDC or any of the market firms that count servers.”

If the ODM numbers were inaccurate, then, what did the server market look like in reality? One hint arrived a year later, when Intel indirectly crowned Google as the world’s fifth largest server manufacturer. By itself, this was interesting, but could still be dismissed as an isolated trend.

Recent events, however, call that into question. Rackspace, an environment that has historically been 60% Dell and 40% HP, will be moving as of April, according to CTO John Engates:

“Basically back to our own designs because it really doesn’t make a lot of sense to put cloud customers on enterprise gear. Clouds are different animals – they are architected and built differently, customers have different expectations, and the competition is doing different things.”

While it’s true, then, that the initial customers for roll-your-own gear remain large entities like Facebook, Google or Rackspace, it is likely just a matter of time until the ODM manufacturers of the customized gear such as Quanta begin to target enterprises directly. And as for the much larger market of businesses that lack the need or wherewithal to build and design their own servers, an increasing number of them will be running on the custom ODM manufactured gear anyway in the form of public clouds.

The impacts of this shift, once so easily dismissed, are evident already. Dell is attempting to go private to retool its business away from the harsh scrutiny of public markets. VMware is attempting to enlist – among others – service providers built on Dell, HP and other major label hardware in a war on Amazon. The percentages of both revenue and profit that IBM derives from hardware have declined over the last five years. And just yesterday, Oracle announced that its hardware revenues were down by 23 percent and that they aren’t expected to grow this year.

The simple fact is that most hardware manufacturers – like Bosworth’s database vendors – have not responded to what customers have indicated they want – in most cases because it’s at cross purposes with their margins. While they furiously add new features to hardware in an effort to justify their premium pricing, the market has accelerated its consumption of lower cost, more available alternatives in cloud or ODM gear in spite of their respective limitations. Hardware vendors are guilty, in other words, of overestimating the value of innovation at the expense of convenience. Worse, Jevons Paradox tells us that increased technical efficiencies – such as the public cloud’s instant provisioning – will lead to increases in consumption for the parties that provide it. Parties like Amazon.
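The Jevons dynamic is easy to illustrate with arithmetic. In the sketch below – with numbers and elasticity invented purely for illustration – an efficiency gain that halves the effective price of compute increases total consumption whenever demand is sufficiently elastic:

```python
# Hypothetical constant-elasticity demand curve: quantity = k * price^(-e).
# All figures are invented to illustrate the Jevons effect, not real data.

def consumption(price, baseline_qty=100.0, baseline_price=1.0, elasticity=1.5):
    """Units consumed at a given price under constant-elasticity demand."""
    return baseline_qty * (price / baseline_price) ** (-elasticity)

before = consumption(price=1.0)   # 100 units at the old effective price
after = consumption(price=0.5)    # efficiency halves the effective price
print(round(after))               # ~283 units: cheaper provisioning, more use
```

With any elasticity above 1, total spend on the resource rises even as the unit price falls – which is why instant, cheap provisioning is an advantage for the provider, not a threat to it.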

Enterprise hardware vendors, like their software brethren, have an important question to answer: how to offer customers what they want without jeopardizing their business in the process. The market makes clear what the answer is not, at least on a volume basis: packing more features into ever higher priced hardware. The likely candidates for revenue growth moving forward instead will incorporate a combination of sources rather than relying on the hardware alone – sources such as data or network-enabled services.

Given that hardware vendors tend to be maniacally focused on the hardware, however, it remains to be seen whether they will be able to pivot culturally, embracing alternative revenue models. All the more so because the evidence is mounting that they won’t have much time to do so.

Disclosure: Amazon, Dell, and IBM are RedMonk customers. Facebook, Google, and Twitter are not.

Categories: Cloud, Open Source.