In one respect, our attempts at RedMonk to quantify developer behaviors are no different than our qualitative research: we go to where the developers are. But for our purposes, not all developer communities are created equal.
Whether we’re examining GitHub, Hacker News, Ohloh, Popcon, Stack Overflow or another developer environment, we’re most interested in communities we believe to be predictive. Surveying conservative enterprise buyers, for example, recalls the (apparently apocryphal) Henry Ford maxim: “If I’d asked my customers what they wanted, they’d have said a faster horse.”
That being said, it’s useful to understand what the differences are – if any – between more aggressive and more conservative technical communities. To explore a more representative sample, rather than the self-selected audiences we typically focus on, I used Google Consumer Surveys.
The screening question for our audience was simple: “I am a software developer.” Those who answered “Yes” were asked a second question: “I use a Distributed Version Control System (DVCS) such as Git rather than a centralized alternative such as CVS or Subversion.” The question’s neutrality is arguable, and it obviously doesn’t allow respondents to report side-by-side usage, but the primary intent is to elicit preferences.
Previously, an examination of Ohloh’s repositories indicated that amongst that sample of open source projects, 31% were using some form of decentralized version control system (DVCS), against 67% employing centralized alternatives (~2% were undetermined). The question was how this usage pattern compared against a more random sample of the developer population at large.
The answer, broadly, was that it appeared to be in line with average developer preferences.
Those who answered both questions yielded a sample of several hundred software developer respondents. Of these, over a third (35%) reported using DVCS, against 65% using centralized systems such as Subversion. It’s somewhat interesting that a random sample is more aggressive in its usage of DVCS tools than a presumably technically sophisticated audience of open source developers. One likely explanation is that Ohloh’s surveyed repositories include a substantial number of artifact projects for which the costs of migrating to DVCS are not justified, and thus the actual traction for decentralized tools in that dataset is underreported.
Whatever the explanation, the fact that the DVCS traction reported previously in our specialized source is corroborated by a random sample is interesting. If we assume, as speculated above, that DVCS traction is systematically underreported in the Ohloh data, this may support the hypothesis that the developer communities we survey are in fact predictive of general adoption.
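To get a feel for how close 35% and 31% really are, here is a minimal sketch of the usual normal-approximation margin of error for a survey proportion. The 35% and 31% figures come from the data above; the sample size of 400 is purely a placeholder for "several hundred" and is not the actual respondent count, and `margin_of_error` is a name invented for this illustration.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion p observed in n responses (z=1.96 is ~95%)."""
    return z * math.sqrt(p * (1 - p) / n)


survey_share = 0.35  # DVCS share reported by the survey
ohloh_share = 0.31   # DVCS share in the earlier Ohloh repository data
assumed_n = 400      # ASSUMPTION: stand-in for "several hundred" respondents

moe = margin_of_error(survey_share, assumed_n)
low, high = survey_share - moe, survey_share + moe

print(f"survey: {survey_share:.0%} +/- {moe:.1%}")
print("Ohloh figure inside interval:", low <= ohloh_share <= high)
```

Under that assumed sample size the margin works out to a little under five percentage points, so the Ohloh figure would fall inside the survey's interval. A different respondent count would widen or narrow that range.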
In addition to providing general answers, the Google survey included some interesting demographic data. Inferring age, gender, income and so on, Google breaks down respondents by category. Some breakdowns, such as the distribution of answers by income, had sample sizes too small to be worth reporting. And while the Age and Gender breakdowns are not, technically, statistically significant, the number of respondents (several dozen to several hundred, depending on the category) was enough to make the answers interesting.
Amongst our population, here is the breakdown of DVCS usage by Age.
DVCS usage trends by age were generally predictable: younger developers were more likely (by almost 10%) to use DVCS. Interestingly, however, DVCS usage proved most popular with the older segment of the younger group: respondents 35-44 were the most likely of any age group to use DVCS (41.3%). DVCS was next most popular with developers aged 18-24 (38%) and was least popular with developers 65 and over (24.1%).
Again, the numbers must be taken with a grain of salt because of the lack of technical statistical significance, but they’re interesting nevertheless. Gender was even more so.
For reasons that are unclear, in this sample men were almost 10% more likely than women to use DVCS. As mentioned, the results cannot be projected to a general population because of the sample size, but there were more than a hundred respondents in each case. No obvious explanations present themselves for these results, presuming that they reflect an actual trend and are not the product of mere sample size issues.
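To illustrate why subgroup gaps like these are hard to call significant, here is a rough sketch of a pooled two-proportion z-test. It uses the 41.3% and 24.1% age-group figures from above, but the per-group respondent counts are hypothetical, since the survey's actual group sizes are not reported here. `two_proportion_z` is likewise a name made up for this example.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z-test.

    Returns (z, two_sided_p_value) for the difference p1 - p2,
    where p1 and p2 are observed shares in groups of size n1 and n2.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Shares from the survey: ages 35-44 vs. 65 and over.
share_35_44 = 0.413
share_65_plus = 0.241

# ASSUMPTION: hypothetical respondents per group, not the real counts.
for group_size in (30, 60, 200):
    z, p = two_proportion_z(share_35_44, group_size, share_65_plus, group_size)
    print(f"n={group_size} per group: z={z:.2f}, p={p:.3f}")
```

The same gap between two groups yields a smaller p-value as the groups grow. That is the practical reason a difference can look striking in a chart and still fail a significance test when each category holds only a few dozen respondents.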
In any event, this data – small sample size caveats and all – suggests that DVCS adoption is not an artifact of the predictive communities we focus on, and is indeed a real trend amongst the larger developer population. Vendors would do well to plan accordingly.