tecosystems

DVCS and Git Usage in 2013


For all that it feels as if Git has exploded onto the scene, the advancement of distributed version control systems (DVCS) has in fact been laborious and slow. As far back as 2007, we were advocating on behalf of systems such as Bazaar, Mercurial and Git. While it was not yet possible to foresee which of the many DVCS options would emerge as the de facto standard, the evidence suggested that for sophisticated, progressive developers, distributed development was a fundamental game changer. If DVCS was the future, however, it was certainly unevenly distributed.

Three years later in 2010, when we first examined Ohloh’s data on repository distribution across 238,000 projects, Git was just emerging as the DVCS option of choice for developer populations. It was in use at just under 11% of the surveyed projects, easily bettering Mercurial’s 1.25% and Bazaar’s 0.59%. But while Git was outperforming its distributed counterparts, it was distinctly less competitive with centralized alternatives. The venerable CVS was in use at twice as many projects – 26% – and Subversion checked in with a dominating 60% share of all projects. And this was, remember, Ohloh’s survey of open source projects: one that theoretically should have favored DVCS more than a survey of enterprise VCS repositories might have.

Two years later, however, the tide had begun to turn. Git (28%) more than doubled its traction, while CVS usage (12%) was cut in half. Even with effectively no growth from either Bazaar or Mercurial (~2% combined), then, the standard bearer of distributed version control was carrying the approach forward at the expense of older centralized systems.

Three years from our original checkpoint, the picture is even clearer. Having examined Ohloh’s repository data and compared it to prior years, a few conclusions can be drawn. Before we get to those, the source of the data and its issues.

Source

The data in this chart was taken from snapshots of the Ohloh data exposed here.

Objections & Responses

  • “Ohloh data cannot be considered representative of the wider distribution of version control systems”: This is true, and no claims are made here otherwise. While it necessarily omits enterprise adoption, however, it is believed here that Ohloh’s dataset is more likely to be predictive moving forward than a wider sample.
  • “Many of the projects Ohloh surveys are dormant”: This is probably true. But even granting a sizable number of dormant projects, it’s expected that these will be offset by a sizable influx of new projects.
  • “Ohloh’s sampling has evolved over the years, and now includes repositories and forges it did not previously”: Also true. It also, by definition, includes new projects over time. When we first examined the data, Ohloh surveyed fewer than 300,000 projects. Today it’s over 600,000. This is a natural evolution of the survey population, one that’s inclusive of evolving developer behaviors.

With those caveats in mind, here is a chart depicting the total share of repositories attributable to centralized (CVS/Subversion) and distributed (Bazaar/Git/Mercurial).

The trendline here is clear: after years of languishing as a fringe technology, distributed version control systems are on a clear path towards a majority share. From less than 14% of the surveyed repositories just three years ago to almost half today, the growth of DVCS is clear and unimpeded. The qualitative evidence supports this conclusion as well: few vendors today with VCS integration points fail to include, if not standardize on, distributed tools broadly and Git more specifically.

On the latter point, while the growth of DVCS in general is clear, many are curious as to Git’s part in that. Is it the clear default it was a year ago, or have alternative projects benefited from the rising tide of distributed version control?

The short answer is no, they have not.

The focus of this chart tends to be either the growth of Git or the declines in both CVS and Subversion, but the continuing lack of traction for both Bazaar and Mercurial is notable. Git’s dominance, whether it’s because of its power and speed or in spite of its idiosyncratic syntax, is effectively locked in at this point. There are no factors in either this data or the anecdotal evidence to suggest a simmering interest in Git alternatives that could fuel a comeback. While it’s possible, then, that Bazaar and Mercurial could become greater factors in the version control space moving forward, it’s not likely.

Consider the following chart that depicts the overall gains or losses in share from 2010 to 2013.

In a mere three years, Git’s share of surveyed repositories is up more than 27 percentage points (27.09%), while Bazaar and Mercurial are up 1.41% and 0.75%, respectively. But at least they’re positive over that timeframe, slight though their gains may have been. CVS usage is down almost 16% since 2010. Subversion’s case is even more interesting. From 2010 to 2012, it declined only 4.3%. In the last year, however, it began to slip much more dramatically, and since 2010 it is off 13%.
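For readers who want to check the arithmetic, here is a rough sketch in Python that combines the 2010 shares quoted earlier with the changes in share described above. It is an illustration built only from the rounded figures in this post, not from the underlying Ohloh snapshots: Git’s 2010 share is given only as “just under 11%”, and the CVS and Subversion changes are rounded, so the outputs are approximations. The function and variable names are made up for this example.

```python
# 2010 shares (percent of surveyed projects) as quoted in this post.
# Git is described only as "just under 11%", so 11.0 is a rounded stand-in.
SHARE_2010 = {
    "Git": 11.0,
    "Mercurial": 1.25,
    "Bazaar": 0.59,
    "CVS": 26.0,
    "Subversion": 60.0,
}

# Change in share from 2010 to 2013, in percentage points, as quoted above.
# The CVS ("almost 16%") and Subversion ("13%") figures are rounded.
CHANGE_2010_2013 = {
    "Git": 27.09,
    "Bazaar": 1.41,
    "Mercurial": 0.75,
    "CVS": -16.0,
    "Subversion": -13.0,
}

DISTRIBUTED = {"Git", "Mercurial", "Bazaar"}


def implied_share_2013(share_2010, change):
    """Add each tool's change in share to its 2010 share."""
    return {tool: round(share_2010[tool] + change[tool], 2) for tool in share_2010}


def group_share(shares, tools):
    """Sum the shares of the tools that belong to the given group."""
    return round(sum(value for tool, value in shares.items() if tool in tools), 2)


share_2013 = implied_share_2013(SHARE_2010, CHANGE_2010_2013)
dvcs_2010 = group_share(SHARE_2010, DISTRIBUTED)
dvcs_2013 = group_share(share_2013, DISTRIBUTED)

print(f"Distributed share, 2010 (approx.): {dvcs_2010}%")
print(f"Distributed share, 2013 (approx.): {dvcs_2013}%")
for tool, value in sorted(share_2013.items(), key=lambda item: -item[1]):
    print(f"  {tool}: {value}%")
```

With these rounded inputs, the distributed total for 2010 comes out a little under 13%, consistent with the “less than 14%” figure cited earlier, and the implied 2013 total lands in the low 40s. Treat both as ballpark numbers given the rounding in the inputs.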

The data seems clear, but a few questions remain.

The first, and maybe the most interesting, is not why this has happened, but why it has taken so long. From open source databases to runtime fragmentation to cloud infrastructures, the world has been completely turned over in a far shorter span of time than it’s taken DVCS to gain even near equivalent acceptance amongst developer populations. To answer this, many point to the friction of migrating a given codebase from one version control tool to another, or the increasingly complicated integrations of existing VCS tools into build chains. Both are likely contributors, but my suspicion is simpler: developers didn’t drive the adoption of these tools as quickly as a few others because they were harder to master. MySQL, remember, is in most respects a simpler product to adopt for developers accustomed to the traditional, full function commercial databases. DVCS, on the other hand, not only requires that developers learn an entirely new tool and syntax, but that they change the way they think. This philosophical difference in approach has caused even high profile developers to struggle with its implications, and thus be slower to adopt and propagate the technologies. This, more than any other factor including enterprise conservatism, is likely why it’s taken the better part of a decade for DVCS to really hit its stride.

The second, and more important, question is: what does the above mean to me? That depends, clearly, on what your position is.

  • If you’re using Git already and/or advocating its usage: The above should validate either your usage or your arguments in favor of it.
  • If you’re using Bazaar, Mercurial or another DVCS: Nothing in the above precludes you from continued usage. There may be advantages to migrating to Git, including relative differences in developer familiarity with the toolsets, but the fundamental importance of Git is its enablement of distributed development, which these tools allow. You might consider switching then, but it’s certainly not an immediate necessity.
  • If you’re using CVS, Subversion or a centralized alternative: Much depends on your usage, of course, but the benefits of DVCS over a centralized alternative are substantial. The speed of distributed development, which can occur in parallel, is likely to greatly exceed that permitted by centralized alternatives, which operate on a serial model. With Git becoming a de facto developer standard, as well, reliance on non-distributed tools could impact both hiring and retention. There are exceptions, of course, but the recommendation here is to evaluate Git as a future path for all of the above reasons.

Overall, these charts will surprise only the most conservative of enterprises. The rise of Git has been well chronicled, and the number of large projects such as Eclipse that have transitioned from centralized to distributed models has elevated awareness of the trend. That DVCS is increasingly the preference and Git the tool of choice is, in many quarters, accepted as a given. The data above merely tests this assumption quantitatively and concludes that it is, according to this sample at least, justified.