One of the more common areas of inquiry around open source for us at RedMonk concerns project contributors. Who is contributing to what project? What are the relative rates of contribution from contributor to contributor? How do the contributions to a project compare to contributions from competitive projects?
In many cases, these are difficult if not impossible questions to answer because the identities and affiliations of project contributors are obscure, whether by design or simply because developers prefer to keep their individual identities independent from their employer. But just because a question is difficult to answer and may return imperfect results does not mean that it’s not worth asking.
With my colleague having updated some of the basic community metrics from Hacker News and Stack Overflow that we track on the configuration management space, we’ll take a look here at the top 5 contributors by domain to the Ansible, Chef, Puppet and SaltStack projects. CFEngine is omitted here because their GitHub repository for Version 3 is not backwards compatible with Version 2 and thus doesn’t accurately represent total project traction.
To set the context, however, here are a few charts comparing the surveyed projects to one another. First, we’ll look at the number of accepted pull requests across the four projects.
As noted by Donnie, Salt is disproportionately represented because pull requests are the sole contribution mechanism. As Ansible’s Greg DeKoenigsberg observes, however, it’s important to caveat both this metric and the following two by noting that Ansible and Salt are GitHub native and thus can be expected to outperform in that context.
With that said, here is how the projects compare relative to one another in terms of GitHub stars accumulated.
Even with the aforementioned qualifiers attached, Ansible’s performance here is notable. This trend is not new; the project has been popular on GitHub at least since 2013 when we added it to the projects in this space we track. Ansible is also the leader on GitHub in the number of forks, although its lead is less substantial in that category.
As noted, the fact that Ansible and Salt are the leaders in these categories is unsurprising given their relationship with GitHub the platform. But these metrics are, as mentioned, opaque. Who, precisely, is contributing to the projects?
To explore this question we turned to the actual Git commit logs for each project. More specifically, we extracted the email addresses per commit, and then looked at the contributions on a per domain basis. The following charts look at the top five contributors to each by domain. One quick caveat: no edits, corrections or consolidations have been made to these charts, so if there are multiple domains representing one company (e.g. ansible.com and ansibleworks.com) they are not consolidated here. We’ll revisit this approach in future, but the results are presented here without alteration, so it’s important to take that into consideration when evaluating relative contribution levels.
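For readers who want to try something similar, here is a minimal sketch of the per-domain tally in Python. It is not the script used for the charts. It assumes the author email on each commit (`git log --format=%ae`) is the address of interest, and the function names `author_emails` and `top_domains` are hypothetical names chosen for illustration.

```python
import subprocess
from collections import Counter


def author_emails(repo_path):
    """Return one author email per commit in the repository at repo_path.

    Assumption: the author email (%ae) is used, not the committer email.
    """
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%ae"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


def top_domains(emails, n=5):
    """Count commits per email domain and return the n largest as (domain, count)."""
    counts = Counter(
        email.rsplit("@", 1)[1].lower()  # text after the last "@", case-folded
        for email in emails
        if "@" in email                  # skip malformed addresses
    )
    return counts.most_common(n)
```

Calling `top_domains(author_emails("path/to/clone"))` on a local clone would return the five largest domains by commit count. Like the charts, the sketch does no consolidation, so related domains such as ansible.com and ansibleworks.com are counted separately.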
The dominant contributing domain to the Ansible project is gmail.com, with a substantial majority coming from there. This potentially reflects the project’s age and permissive policies with respect to contributions, policies that presumably extend to contributions from internal employees as well. The strong presence here from Fedora is no surprise, given both Red Hat’s ties to the Ansible project and Fedora’s heavy usage of it. The remainder of the contributions are attributable to Ansible-related domains: ansible.com/ansibleworks.com and sngx.net (James Cammarata, Ansible employee).
As might be expected for a project of Chef’s age and maturity, the majority of contributions to the project originate from Chef-owned domains (chef.io and opscode.com). Independent gmail.com addresses are a distant second place (~9X less), but it was interesting to see India-based Clogeny check in at fourth place. The top five is rounded out by magoazul.com, which appears to be the personal domain of Matthew Kent, currently a Basecamp (née 37Signals) employee.
As with Chef, the other elder statesman of this category, contributions to Puppet are dominated by the puppetlabs.com domain. There are 14X more contributions from that domain than from cloudsmith.com, itself the domain of a company since acquired by Puppet. Third place, for its part, is the personal domain of Adrien Thebo, currently a Puppet Labs employee. Fourth place goes to gmail addresses, and fifth to Days of Wonder, a board game manufacturer (e.g. Ticket to Ride). The latter seems to largely be the work of developer Brice Figureau, a major contributor to the project.
Following in Ansible’s footsteps, the younger Salt project is overwhelmingly composed of contributions from gmail.com addresses. The number three and four contributing domains – saltstack.com and eseth.com – effectively reflect company contributions, as Seth House is a SaltStack employee. The number two contributing domain, however, belongs to Pedro Algarvio, a Python developer with no formal affiliation with the project as documented by Matt Asay. The fifth largest contributing domain, meanwhile, is clemson.edu, no surprise given that university’s public usage of Salt.
In general, the findings from this project are mostly unsurprising. Older, more mature projects skew towards contributions from employees, while the younger would-be disruptors may or may not feature similar percentages of employee contributions, but if so are at least less formal in their contribution policies. It will be interesting to see whether or not Ansible or Salt’s contribution policies become more formal and employer-centric over time. It will likewise be necessary to monitor whether or how these relative contribution levels evolve; does it become more difficult for individual developers to rank amongst the top contributors? Do we see influxes of new contributions as the relative dynamics of project adoption shift? Can we expect changes in terms of the internal to external contribution ratios?
Either way, it’s interesting to go beyond strict contribution metrics to get a closer look at who the contributors are and where they’re coming from, even if it’s difficult or impossible to discover in some cases.
Disclosure: Ansible and Chef are RedMonk clients, while Puppet and Salt are not.
Michael DeHaan says:
April 2, 2015 at 5:46 pm
May be worth noting that while I don’t work for Ansible anymore, I have over two thousand commits under my gmail account. Ansible did not have a policy around what domain you had to commit with, but that bar will be about 3.5 wider than the sngx one 🙂 So there’s a LOT of corporate commits there too, the data is just tricky.
So it’s really a graph of whether a company set a domain-commit policy or not. We hadn’t.
Innovation?! The Feds Are In On IT! Apprenda Marketwatch - Apprenda says:
April 3, 2015 at 11:46 am
[…] Who’s Contributing to Configuration Management Projects? … Via Stephen O’Grady, RedMonk […]