One of the most common questions regarding open source licensing today concerns trajectories. Specifically, what are the current directions of travel, both for specific licenses and for license types more broadly? Or put more simply, what licenses are projects using today, and how is that changing?
We’ve examined this data several times, most recently in this January look at the state of licensing based on Black Duck’s dataset. That data suggested major growth for permissive licenses, primarily at the expense of reciprocal alternatives. The Apache and MIT licenses, for example, were up 10% and 21% respectively, while the GPL was down 27%. All of this is on a relative share basis, of course: the “drop” doesn’t reflect relicensing of existing projects, but less usage relative to its peers.
Many have suggested that this surge in permissive licensing is an artifact of Black Duck’s dataset. Although the trend is generally validated by other sources such as GitHub, it’s worth examining the question in more detail. While it’s impossible to comprehensively examine every open source project in existence, one useful exercise is to look at the choices being made by emerging communities. If there’s a trend towards permissive licensing, younger communities should presumably show higher permissive usage than older ones.
One such community with enough of a sample size to be relevant is the one currently forming around the Cloud Native Computing Foundation. Founded in 2015 with the Kubernetes project as its first asset, the Foundation has added eleven more open source projects, all of which are licensed under the same Apache 2 license. But as a successful Foundation is only a part of the broader ecosystem, the real question is what are the licensing preferences of the Cloud Native projects and products outside of the CNCF itself.
To examine this, we researched the individual licenses for the 290+ projects and vendors from the CNCF’s Landscape Project. Before we get to the results, a few caveats.
- While there are over a hundred combined proprietary services and products on the list, they are not included in this analysis because it’s difficult to cleanly differentiate between purely proprietary services and those that are based entirely or primarily on open source products.
- This analysis pools all versions of a license together; GPLv2 and v3, for example, are simply grouped as GPL.
- Only OSI-approved licenses are included here, so the Cockroach Community license is omitted.
- Dual or tri-licenses are counted once per license: an MIT/MPL/GPL tri-license, for example, would result in one count for each mentioned license.
- Only core projects were considered; if an IaaS or SaaS provider, for example, is primarily closed but makes available agents as open source, they are not included here.
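The counting rules above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling actually used for this analysis: the project names and license strings are made-up placeholders, the `OSI_FAMILIES` set is an assumed short list of license families, and the `normalize` and `tally` helpers are hypothetical names.

```python
from collections import Counter

# Assumed (incomplete) list of OSI-approved license families to keep.
OSI_FAMILIES = ["Apache", "AGPL", "LGPL", "GPL", "MIT", "BSD", "MPL", "EPL"]


def normalize(license_name):
    """Pool all versions of a license into one family.

    'GPLv2' and 'GPL-3.0' both become 'GPL'. Returns None when the
    license is not in the assumed OSI list, so the caller can skip it.
    """
    name = license_name.strip().upper()
    for family in OSI_FAMILIES:
        if name.startswith(family.upper()):
            return family
    return None


def tally(projects):
    """Count license families across projects.

    `projects` maps a project name to a license string; dual or
    tri-licenses are written with '/' and counted once per license.
    """
    counts = Counter()
    for license_field in projects.values():
        for part in license_field.split("/"):
            family = normalize(part)
            if family is not None:  # non-OSI licenses are omitted
                counts[family] += 1
    return counts


# Placeholder data for illustration only, not the CNCF Landscape.
sample = {
    "project-a": "Apache-2.0",
    "project-b": "GPLv2",
    "project-c": "GPL-3.0",
    "project-d": "MIT/MPL/GPL",                  # tri-license: three counts
    "project-e": "Cockroach Community License",  # non-OSI: skipped
}

counts = tally(sample)
print(counts.most_common())
```

One consequence of counting multi-licensed projects once per license is that the total number of license entries can exceed the number of projects, so shares computed this way are shares of entries rather than of projects.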
With that, here is the distribution of licenses across the CNCF Landscape’s open source projects.
Unsurprisingly, perhaps, given the influence of the CNCF itself, Apache strongly outperforms all other licenses, showing far greater relative adoption than it has in more generalized datasets such as the Black Duck survey. Overall in this dataset, approximately 64% of projects are covered by the Apache license. No other license has greater than a 12% share. The only other licenses above 10%, in fact, are the GPL at 12% and MIT at 11%. After that, the remaining licenses are all at 5% or less.
The strong affinity for the Apache license amongst CNCF projects is, as mentioned, undoubtedly influenced by the CNCF itself. It may also be suggestive of a more general preference for Apache as the business permissive license of choice. While large repositories such as GitHub whose volume comes from millions of individual project contributions tend to be dominated by the less restrictive MIT, it may be that Apache’s patent provisions and general business friendliness are part of the reasons for its dominance in this sample.
While Apache is clearly the dominant license of the CNCF ecosystem, however, the GPL’s second place performance here is notable, given that it bests both the MIT – the most popular open source license in datasets such as Black Duck’s – as well as the BSD. It’s also worth noting that while some of the GPL projects considered part of the CNCF Landscape are older, as with MySQL for example, many are of more recent vintage (e.g. Ansible).
In general, however, the CNCF’s landscape would appear to be tilted significantly towards permissive licenses, and that is indeed the case if we look at distribution of license types among the surveyed projects.
Amongst the projects in the CNCF’s sample, fully 80% are permissively licensed against 14% reciprocal (e.g. GPL) and 6% weak copyleft (e.g. MPL). And while non-OSI hybrid licenses are omitted here as mentioned above, it’s worth noting that CockroachDB is the only such licensed project surveyed here, so its omission is not statistically meaningful.
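For readers who want to reproduce this kind of roll-up on their own data, here is a rough sketch of how per-license counts can be grouped into license types. The mapping of GPL to reciprocal and MPL to weak copyleft follows the examples given here; the classification of the other licenses, the `LICENSE_TYPES` table, the `type_shares` helper and the counts passed to it are all assumptions for illustration, not the figures from this survey.

```python
from collections import Counter

# Assumed mapping from license family to license type.
LICENSE_TYPES = {
    "Apache": "permissive",
    "MIT": "permissive",
    "BSD": "permissive",
    "GPL": "reciprocal",
    "AGPL": "reciprocal",
    "MPL": "weak copyleft",
    "LGPL": "weak copyleft",
    "EPL": "weak copyleft",
}


def type_shares(counts):
    """Return each license type's share, in percent, of all entries.

    `counts` maps a license family to the number of times it appears.
    Raises KeyError for a family missing from LICENSE_TYPES, so that
    unclassified licenses are noticed rather than silently dropped.
    """
    totals = Counter()
    for family, n in counts.items():
        totals[LICENSE_TYPES[family]] += n
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {kind: 100 * n / grand_total for kind, n in totals.items()}


# Illustrative counts only; these are not the survey's numbers.
shares = type_shares({"Apache": 6, "MIT": 2, "GPL": 1, "MPL": 1})
for kind, pct in shares.items():
    print(f"{kind}: {pct:.0f}%")
```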
It’s important to acknowledge that one ecosystem should not, in any sense, be considered representative of the industry more broadly. While the sample size here is reasonable and the ecosystem spans multiple categories – and yes, the per category numbers will be looked at in future – the relative share of licenses here in the CNCF is just a datapoint. A datapoint that is consistent with the hypothesis that permissive licenses are growing in popularity, to be sure, but still just a datapoint. Further examinations of other open source communities will be necessary before we can decide whether the CNCF is the rule or the exception that proves it.
In the meantime, however, it’s interesting to see a new and large community systematically favor a license that imposes so few restrictions on its users and customers. The largest pure play open source business in the world, Red Hat, has primarily been fueled by one GPL-licensed asset in RHEL, but it will be interesting for both the company and the wider cloud native ecosystem to see whether or not a permissively licensed asset can power a business of similar or even greater size.
Disclosure: The Linux Foundation, CNCF parent, is a RedMonk customer, as is Red Hat.
Credit: Rachel Stephens assisted in the information collection and analysis for this piece.
Dan Kohn says:
September 23, 2017 at 9:59 am
Very nice analysis. You might want to link back to our explanation of why CNCF recommends the Apache 2.0 license, especially since we even reference one of *your* earlier pieces:
We have a 1.0 interactive version of the cloud native landscape coming soon with a YAML file linking to the repo of every project, which will make this kind of analysis easier. It would also be interesting to do an analysis of languages (presumably showing the rise of Go for cloud computing).