tecosystems

Forking, The Future of Open Source, and Github


“I cannot help you with your question, sir, for I do not understand it. It is a wrong question, sir.” – Susanna Clarke, Jonathan Strange & Mr. Norrell

Being what I should have replied to Mike Milinkovich last week, but didn’t think to. But let me back up.

Last Wednesday, at the kind invitation of the folks from Eclipse, I had the opportunity to sit with more august company – Justin Erenkrantz (Apache), Mårten Mickos (Eucalyptus), and Jason van Zyl (Maven/Sonatype) – on a panel charged with debating the future of open source. Among the questions posed to us was this: is the future of open source going to be based on communities such as Apache and Eclipse or will it be based on companies that sell open source?

My reply? Neither. It’s Github.

Intending no disrespect to either category, of course. Communities such as Apache, Eclipse and Mozilla are and will remain massive centers of gravity for open source projects. And commercial vendors – yes, even those that dare practice “open core” – will continue to inject revenue into the larger open source ecosystem, underwriting substantial portions of the costs of producing free and open source software as Justin said. But when we talk about foundations and vendors jointly and the mechanics of open source, what we’re really talking about is the past and the present. Not the future.

Foundations and vendors, irrespective of whether or not they’ve transitioned from centralized to decentralized version control systems, are essentially command and control development models. Developers, corporations and other contributors build towards a single project in a defined hierarchy, which a given potential contributor may or may not have the right to commit to. This single project is then either procured directly, or more typically consumed downstream via a community (e.g. Debian, Fedora) or commercial distribution (e.g. RHEL, Cloudera, etc). In terms of its development paradigm, open source development happens to look a lot like proprietary development. It just happens to occur in the open. The future, meanwhile, is almost certainly going to look different than the present. In part because the tooling encourages it.

We’ve been tracking decentralized version control tools for four or five years now. Three years ago, I was openly perplexed about the continuing lack of interest in the subject. In retrospect, however, the slow ascendancy of distributed version control systems (DVCS) should have been predicted. Not because we’re usually ahead of the curve when it comes to adoption; I’ve been waiting for NoSQL for five years. Rather, the lag in DVCS adoption should have been anticipated because it takes people – even (especially?) the really smart ones – time to come around to the idea of decentralization. Witness Joel’s conversion:

I studied, and studied, and finally figured something out. Which I want to share with you.

With distributed version control, the distributed part is actually not the most interesting part.

The interesting part is that these systems think in terms of changes, not in terms of versions.

That’s a very zen-like thing to say, I know. Traditional version control thinks: OK, I have version 1. And now I have version 2. And now I have version 3.

And distributed version control thinks, I had nothing. And then I got these changes. And then I got these other changes.

It’s a different Program Model, so the user model has to change.

In Subversion, you might think, “bring my version up to date with the main version” or “go back to the previous version.”

In Mercurial, you think, “get me Jacob’s change set” or “let’s just forget that change set.”

If you come at Mercurial with a Subversion mindset, things will almost work, but when they don’t, you’ll be confused, unhappy, and unsuccessful, and you’ll hate Mercurial.

Whereas if you free your mind and reimagine version control, and grok the zen of the difference between thinking about managing the versions vs. thinking about managing the changes, you’ll become enlightened and happy and realize that this is the way version control was meant to work.
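Joel's distinction is easier to see in a toy model. The sketch below is my own illustration, not how Mercurial or Git actually store or apply changesets: the `Change` class, the `apply` function and the sample data are all hypothetical. A repository here is nothing more than an ordered list of changes, and the current state is whatever you get by replaying them on top of nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """A hypothetical change set: who made it and what it sets."""
    author: str
    key: str
    value: str


def apply(changes):
    """Replay an ordered list of changes on top of nothing."""
    state = {}
    for change in changes:
        state[change.key] = change.value
    return state


# "I had nothing. And then I got these changes."
mine = [Change("me", "greeting", "hello")]
jacobs = Change("jacob", "colour", "blue")

# "Get me Jacob's change set": add his change to my history.
mine.append(jacobs)
with_jacob = apply(mine)

# "Let's just forget that change set": drop it and replay the rest.
mine.remove(jacobs)
without_jacob = apply(mine)
```

There is no "version 2" or "version 3" anywhere in this model. The state is always derived from whichever set of changes you currently hold, which is the shift in thinking the quote describes.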

When we concluded that version control systems such as Git or Mercurial would become popular, if not the default, we had had no such epiphanies, no similar flashes of insight. It was simply brute force observation: whatever the reasoning, more and more developers, projects and firms were transitioning away from centralized to decentralized. And happier for it. The trendline was clear, which is why we weren’t exactly going out on a limb predicting the ascension of Git, Mercurial and their brethren.

What was less obvious was the profound, outsized impact decentralized version control would have on the future of open source, and thus development in general.

Open source development traditionally has been, for better and for worse, a social activity. With the accelerating adoption of tools like hosted Git, it’s even more so. Why? As counterintuitive as it might seem, the ability to fork.

Forking has historically been an option of last resort; how developers proceed when all other remedies are exhausted. First because it’s potentially damaging to the originating project, but more because it’s a significant logistical challenge. As Brian Aker put it:

Forking software over small changes is for the most part unviable because of the cost of keeping a fork of the software up to date, but it is not impossible.

What if the costs of forking became negligible, for the originating project and those who wish to fork a project? What if tools such as decentralized version control made it possible to work on projects not in centralized check-in, check-out fashion, but individually?

As Brian discussed at OSCON back in 2008, all of a sudden forks become trivial; both to execute, and potentially to reintegrate. On Github, forking is quite literally pushbutton. In terms of their ability to permit greater creativity, forks cease being a cancer and become a cure. Sometimes, anyway. Because while it’s simple to fork, it’s not much harder to reintegrate. Losing the shackles of centralized development accelerates development and increases creativity. Add in a centrally hosted network model, and everything from discovery to social features becomes possible. As jwz once said, “these days, almost all software is social software.” Why should version control be any different?
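Why forking and reintegration get cheap once history is a set of changes can be sketched in a few lines. This is a deliberately simplified illustration under my own assumptions: `fork`, `missing_from` and `reintegrate` are hypothetical names, changes are plain strings, and conflicts between changes are ignored entirely, which real tools cannot do.

```python
def fork(history):
    """A fork is an independent copy of the full history."""
    return list(history)


def missing_from(upstream, other):
    """Changes present in `other` that `upstream` has not seen yet."""
    seen = set(upstream)
    return [change for change in other if change not in seen]


def reintegrate(upstream, other):
    """Pull only the changes upstream lacks, keeping their order."""
    return upstream + missing_from(upstream, other)


upstream = ["c1", "c2"]

# Forking costs one copy; nobody needs commit rights to do it.
my_fork = fork(upstream)
my_fork.append("fix-typo")

# Upstream keeps moving in the meantime.
upstream.append("c3")

# Reintegration only has to consider what upstream has not seen.
merged = reintegrate(upstream, my_fork)
```

Because both sides share the same early history, the work of reintegrating is proportional to what the fork added rather than to the size of the project. That is the property that turns a fork from a last resort into a routine step.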

The future to me, then, looks a lot more like Github than it does a foundation or vendor. It is becoming the breeding ground for thousands of innovations that may aspire to grow up to be full fledged foundation projects, commercial products, or both. So much so that a number of people, like Phil Wilson, worry about what would happen if Github went away. As they should: look at some of the projects hosted there.

Because while Github will never replace the foundations, let alone the vendors, it will increasingly become the foundations upon which many of their component projects are built. If you haven’t been paying attention to the service, then, I’d suggest giving it a look. It’s the shape of things to come.

Disclosure: Apache and Eclipse are RedMonk customers; Github and Mozilla are not.