tecosystems

The State of Open Source Licensing


It’s become common in the technology industry today to say that open source has gone mainstream. Evidence for this assertion abounds. From the multiplying number of projects to accelerating participation from vendors who once were dedicated to protecting their source code, open source is more accepted and more of a default by the day. What the industry doesn’t talk as much about is what open source means.

While we tend to talk about open source as a singular, cohesive category for the sake of convenience, this is obviously an oversimplification. While the fundamental concept of making source code open is common, the rights, responsibilities and privileges conferred thereby are interpreted very differently from community to community.

Copyleft licenses, for example, of which the GPL is the most notable variant, are committed to the freedom of the source code. Code governed by a copyleft license asks for reciprocity from consumers; if changes to the code base are made and distributed (we’ll come back to that word), they must be released and shared under the original terms. Permissive licenses, on the other hand, are built around freedom for the developer: permissively licensed assets impose few if any restrictions on downstream users, and require no such reciprocity. Both communities are strongly committed to freedom; the difference lies in what, precisely, is kept free.

For the better part of the last decade, with better than two thirds of open source projects licensed under the GPL, one might reasonably assume that copyleft was the standard or default, and anything weaker the exception. For years now, however, the dominance of that style of license has been steadily eroding, giving way to licenses at the opposite, permissive end of the spectrum.

Indeed, if we contrast each license’s share of the repositories surveyed by Black Duck this month versus January 2010, the shift is quite apparent.


[Chart: each license’s share of the repositories surveyed by Black Duck, January 2010 versus 2017]

In Black Duck’s sample, the most popular variant of the GPL – version 2 – is less than half as popular as it was (46% to 19%). Over the same span, the permissive MIT has gone from 8% share to 29%, while its permissive cousin the Apache License 2.0 jumped from 5% to 15%. What this means is that over the course of a seven-year period, the GPLv2 has gone from being roughly equal in popularity to the next nine licenses combined to sitting 10 percentage points out of first place.
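The arithmetic behind those comparisons can be checked in a few lines. This is a minimal sketch using only the three pairs of shares quoted above; the names `shares`, `summarize`, `leader_2017` and `gap` are illustrative and are not part of Black Duck's data or tooling.

```python
# Shares (percent of surveyed repositories) quoted in the post,
# as (January 2010, 2017) pairs.
shares = {
    "GPLv2": (46, 19),
    "MIT": (8, 29),
    "Apache 2.0": (5, 15),
}


def summarize(shares):
    """Return the percentage-point change and the after/before ratio per license."""
    rows = {}
    for name, (before, after) in shares.items():
        rows[name] = {
            "point_change": after - before,  # difference in percentage points
            "ratio": after / before,         # 2017 share relative to 2010 share
        }
    return rows


summary = summarize(shares)

# License with the largest 2017 share among the three quoted here.
leader_2017 = max(shares, key=lambda name: shares[name][1])

# How far the GPLv2 trails that leader, in percentage points.
gap = shares[leader_2017][1] - shares["GPLv2"][1]

for name, row in summary.items():
    print(f"{name}: {row['point_change']:+d} points, {row['ratio']:.2f}x")
print(f"2017 leader: {leader_2017}, GPLv2 gap: {gap} points")
```

Note that the gap is measured in percentage points rather than as a relative percentage, which is how the "out of first place" figure above should be read.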

All of which suggests that if we generally meant copyleft when we were talking about open source in 2007, we typically mean permissive when we discuss it today. This is not to be dismissive of the importance of copyleft licenses in 2017, any more than acknowledging the reverse a decade ago would have been insulting to permissive license advocates. It’s simply a recognition that, by most observable metrics, we’re now in the permissive licensing era.

If we look closely, however, there is more to be extracted from this chart than verifications of a macro-level shift to permissive licenses.

  • Consolidation:
    The above chart is the Top 10 licenses from 2010. All but three have declined in popularity. The GPL took the biggest hit and Apache and MIT were the primary beneficiaries, as discussed, but it’s notable that a relatively popular and well known permissive license like the BSD would not only not benefit from the trend towards permissive licensing, but actually decline over that span. It’s true that the permissive ISC license, used by OpenBSD and NPM among other projects, worked its way into Black Duck’s 2017 Top 10 at 4%, but that still places it behind BSD. Given the degree to which Apache and MIT have separated themselves from other licenses generally and their permissive counterparts specifically, it will be interesting to track whether or not we begin to see consolidation around a smaller number of licenses.
  • Binary Choices:
    Historically, we’ve had three major licensing types: Copyleft, Permissive and a middle ground of weakened-copyleft, less-permissive alternatives. If we aggregate Black Duck’s numbers for LGPLv2.1 (4%), LGPLv3 (2%), EPL (1%), MPLv1.1 (<1%), CDDL (<1%) and CDDLv1.1 (<1%), we get a total share for the category of 7-8%, which raises the question of whether one of the unacknowledged casualties of the permissive licensing surge is the weakened-copyleft license. Most of the coverage of licensing focuses on declines in the GPL, but unmentioned is the invisibility of licenses such as the LGPL – which, to be fair, the FSF itself discourages use of. Increasingly, however, the data suggests the choice is binary: copyleft or not.
  • No License:
    As much as the focus to this point has been on specific licenses and license choices, however, the unfortunate reality is that increasingly open source repositories are populated not by code carrying one of the aforementioned types of licenses, but no license at all. See this chart from GitHub.

    To quote from their piece:

    If you look at this graph of licensed repositories over time, you’ll notice that the percentage of licensed repositories has been decreasing, hovering around 20% throughout GitHub’s history (about 30% if you include forked repositories).

    There are many explanations for this, ranging from developer indifference to vendor apathy, but as an unlicensed project is not an open source project, this remains a problematic trend.

  • AGPL:
    One question that we field regularly is whether we see potential for the AGPL to make a bid for popularity. For the unfamiliar, the AGPL is a copyleft license like the GPL that, unlike the GPL, considers hosting an application online to be distribution, thereby triggering the reciprocal requirements – you can read more about it here. There are various potential justifications for choosing it: a desire to protect a commercial OSS business from being co-opted by large, well capitalized cloud players; pressure from venture capitalists not unrelated to the former; or perhaps just a philosophical belief in the GPL coupled with a recognition that the original license is effectively neutered in a networked world. While on paper the arguments in favor of the license are defensible, and there are some high profile projects that leverage it, my response typically is that there is not yet any qualitative or quantitative evidence to suggest that the AGPL is seeing any surge of adoption as a response to the growth of the cloud. It is rarely if ever the subject of an inquiry with us or the choice of projects we speak with, and quantitatively it’s less than 1% of Black Duck projects and a tick over 1% of GitHub projects during the last public survey. That being said, it’s important to note that license adoption is very much a lagging indicator, as changing a given project’s license carries both high friction and the risk of becoming incompatible with adjacent projects.
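For readers who want to check the weakened-copyleft total from the Binary Choices item, here is a minimal sketch of the aggregation. It assumes each "<1" entry stands for some share strictly between 0 and 1 percent, which is an interpretation and not something Black Duck states; `reported` and `share_bounds` are illustrative names.

```python
# Weak-copyleft shares (percent) as quoted in the Binary Choices item.
# Assumption: a "<1" entry means a share strictly between 0 and 1.
reported = {
    "LGPLv2.1": "4",
    "LGPLv3": "2",
    "EPL": "1",
    "MPLv1.1": "<1",
    "CDDL": "<1",
    "CDDLv1.1": "<1",
}


def share_bounds(reported):
    """Return (low, high): the total lies above `low` and below `high`."""
    low = high = 0
    for value in reported.values():
        if value.startswith("<"):
            # Contributes more than 0 and less than the stated ceiling.
            high += int(value[1:])
        else:
            low += int(value)
            high += int(value)
    return low, high


low, high = share_bounds(reported)
print(f"Category total is between {low}% and {high}%")
```

Under that assumption the exactly reported shares alone account for 7 points, so the 7-8% estimate implies that the three "<1" licenses together contribute at most about one point.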

3 comments

  1. Thanks for the update, as always!

    I think when discussing whether the GPL is declining, it would be more correct to look at the union of GPLv2 and v3 together. After all, the GPLv3 was intended to more or less replace v2, so you would expect the latter to decline.

    Of course, even when summing the two together, there is a very clear decline, and also a several percentage point lead for MIT.

    While I subscribe to the belief that there’s simply a trend to more permissive licensing in general (due to a combination of various reasons), one question to research would be whether the big rewrite that was the GPLv3 has perhaps failed as a license. It is longer than its already long predecessor, seems to invent new legal language solely for itself, and includes what an engineer would consider “cruft”, such as a specific clause to deal with a Microsoft-Novell licensing agreement. It’s possible that engineers and lawyers alike simply don’t want to deal with the license as a license?

    FWIW, from my personal experience, the most common reason to avoid GPLv3 is the DRM prohibitions.

  2. Thanks for the update. I’ll be pedantic for a moment. For all those repositories on GitHub without a license, they are not open source repositories (as you later point out). GitHub claiming an equality between “publicly viewable” (their requirement for a free repository) and “open source” is disingenuous at best.

    I worry about the race to ASLv2. Henrik’s analysis of vibrant projects 6-7 years ago found that the nine largest, most active projects were wrapped in a foundation. The tenth largest was an order of magnitude smaller and still inside a company. He was careful to list his assumptions and not claim causality.

    Looking back, however, we also see 8 of the 9 are reciprocal licensed (6 GPL, 1 Eclipse, 1 Mozilla). Only the Apache group was ASLv2 licensed. In discussions with ASF folks as to why we had one differentiated license data point, we thought that the Apache incubation process may force a similar behaviour on the development team for project growth that is otherwise forced by the license.

    Lots of folks in the past six years have rushed to adopt ASLv2 as “the more business friendly license” without rationale or proof. I’m not sure how we measure it (yet), but I think the belief needs a little more data to support it.

    1. It would be interesting (albeit complex at best) to try to evaluate how the industrial relevance of the above-mentioned licenses has changed. As Stephen (and formerly Henrik) highlighted, how a community is managed is important too, though. Correlating community governance models, software licenses and industrial impact would tell us everything we always wanted to know about the adoption of open source software licenses.
