To many in the technology industry, the dominance of Decentralized Version Control Systems (DVCS) generally and Git specifically is taken as a given. Whether it’s consumed as a product (e.g. GitHub Enterprise/Stash), service (Bitbucket, GitHub) or base project, Git is the de facto winner in the DVCS category, a category which has taken considerable share from its centralized alternatives over the past few years. With macro trends fueling further adoption, it’s natural to expect that the ascent of Git would continue unimpeded.
One data source which has proven useful for assessing the relative performance of version control systems is the repository data from Open Hub (formerly Ohloh). Built to index public repositories, it gives us insight into respective usage, at least within its broad dataset. In 2010, when we first examined its data, Open Hub was crawling some 238,000 projects, and Git managed just 11% of them. For this year’s snapshot, the project count has swelled to over 674,000, or close to 3X as many. And Git is playing a much more significant role today than it did then.
Before we get into the findings, more details on the source and issues.
The data in this chart was taken from snapshots of the Open Hub data exposed here.
Objections & Responses
- “Open Hub data cannot be considered representative of the wider distribution of version control systems”: This is true, and no claims are made here otherwise. While the dataset necessarily omits enterprise adoption, we believe it is more likely to be predictive moving forward than a wider sample.
- “Many of the projects Open Hub surveys are dormant”: This is probably true. But even granting a sizable number of dormant projects, it’s expected that these will be offset by a sizable influx of new projects.
- “Open Hub’s sampling has evolved over the years, and now includes repositories and forges it did not previously”: Also true. It also, by definition, includes new projects over time. When we first examined the data, Open Hub surveyed fewer than 300,000 projects. Today it’s over 600,000. This is a natural evolution of the survey population, one that’s inclusive of evolving developer behaviors.
With those caveats in mind, let’s start with the big picture. The following chart depicts the total share of repositories attributable to centralized (CVS/Subversion) and distributed (Bazaar/Git/Mercurial) systems.
Even over a brief three-year period (we lack data for 2011, and have thus omitted 2010 for continuity’s sake), it’s clear that distributed systems have made substantial inroads. DVCS may not be quite as dominant as is commonly assumed, but it’s close to managing one in two of the projects Open Hub indexes. Considering the inertial effects operating against DVCS, this traction is impressive. In spite of the fact that it can be difficult even for excellent developers to shift their mental model from centralized to decentralized, that version control systems are not typically given the priority of other infrastructure elements, and that the risks associated with moving from one system to another are non-trivial, DVCS has clearly established itself as a popular, mainstream option. Close observation of the above chart, however, reveals a slight hiccup in adoption numbers which we’ll explore in more detail shortly.
In the meantime, let’s isolate the specific changes per project between our 2014 snapshot and the 2010 equivalent. How has their relative share changed?
As might be predicted, comparing 2010 to 2014, Git is the clear winner. The project with the idiosyncratic syntax made substantial gains in share (25.92 percentage points), partially at the expense of Subversion (-12.02) but more so at the expense of CVS (-16.64). Just as clearly, Git is the flag bearer for DVCS more broadly, as the other decentralized version control systems, Bazaar and Mercurial, showed only modest improvement over that span: 1.33 and 1.41 points respectively. The takeaways from this span, then, are first that DVCS is a legitimate first-class citizen and second that Git is the most popular option in that category.
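As a quick sanity check on the figures just quoted, the per-system changes can be summed by category. The sketch below is only an illustration: the variable and function names are ours, and the only inputs are the five numbers cited in the text, treated as percentage-point changes in share.

```python
# Changes in share of indexed repositories, 2010 -> 2014, as quoted in the
# text. Treating them as percentage points is our reading of the figures.
share_change = {
    "Git": 25.92,
    "Bazaar": 1.33,
    "Mercurial": 1.41,
    "Subversion": -12.02,
    "CVS": -16.64,
}

# Grouping used in the post: Bazaar/Git/Mercurial are distributed,
# CVS/Subversion are centralized.
DISTRIBUTED = {"Git", "Bazaar", "Mercurial"}


def net_change(systems):
    """Sum the share change for a group of systems, rounded to 2 decimals."""
    return round(sum(share_change[name] for name in systems), 2)


dvcs_gain = net_change(DISTRIBUTED)
centralized_loss = net_change(set(share_change) - DISTRIBUTED)

# Fraction of the total distributed gain that is attributable to Git alone.
git_portion = round(share_change["Git"] / dvcs_gain, 3)

print(f"DVCS net change:        {dvcs_gain:+.2f} points")
print(f"Centralized net change: {centralized_loss:+.2f} points")
print(f"Git's portion of the DVCS gain: {git_portion:.1%}")
```

Under that reading, the distributed gains and the centralized losses cancel out exactly, and Git accounts for roughly nine tenths of the shift toward DVCS.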
What about the past year, however? Has Git continued on its growth trajectory?
The short answer is no. With this chart, it’s very important to note the scale of the Y axis: the changes reflected here are comparatively minimal, which is to be expected over the brief span of one year. That being said, it’s interesting to observe that Subversion shows a minor bounce (1.28%), while Git (-1.17%) took a correspondingly minor step back. Bazaar and CVS were down negligible amounts over the same span, while Mercurial was ever so slightly up.
Neither quantitative nor qualitative evidence supports the idea that Git adoption has stalled, or that Subversion is poised for a major comeback. Wider market and product trends, if anything, contradict both notions, and suggest that the most likely explanation for the delta in Open Hub’s numbers is the addition of major new centrally managed codebases to Open Hub’s index.
It does serve as a reminder, however, that as much as the industry takes it for granted that Git is the de facto standard for version control systems, a sizable volume of projects have yet to migrate to a decentralized system of any kind. The implications of this are many. For service providers who are Git-centric, it may be worth considering creating bridges for users on other systems or even offering assistance in VCS migrations. For DVCS providers, the above may be superficially discouraging, but in reality indicates that the market opportunity is even wider than commonly assumed. And for users, it means that those still on centralized systems should consider migrating to decentralized alternatives, but are by no means condemned to the laggard category.
While we thus assume that the step back for Git is an artifact, it will be interesting to watch the growth of the platform over the next year. One year’s lack of growth is easily dismissed as an anomaly; a second year would be more indicative of a pattern. The 2015 snapshot should tell us which it is.
Disclosure: Black Duck, the parent company of Open Hub, has been a RedMonk customer but is not currently.