In December of 2010, Drew Conway decided to explore quantitatively one of the more popular and contentious subjects debated by developers: the relative popularity of programming languages. To do this, he compared the traction of languages on GitHub and StackOverflow, two sites that are both popular with developers yet have somewhat distinct communities. The GitHub rankings are based on GitHub’s own ranking of the individual languages, while the languages on StackOverflow are ranked according to the volume of tags associated with each language.
The result was a plot that showed a high correlation: popularity on GitHub tended to track popularity on StackOverflow. Ten months later, we repeated this analysis, and did so again in February. These analyses have proven very popular with developers; the latter post was linked to on Twitter nearly six hundred times.
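One way to put a number on that kind of agreement between two rankings is a rank correlation. The sketch below is illustrative only and is not the method used for the plot: it assumes Spearman's coefficient, it assumes no ties, and the language names and figures (`lang_a` through `lang_e`) are made-up placeholders rather than real GitHub or StackOverflow data.

```python
def ranks_from_counts(counts):
    """Rank languages by descending count (1 = most popular). Assumes no ties."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {lang: position + 1 for position, lang in enumerate(ordered)}


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same languages."""
    if set(rank_a) != set(rank_b):
        raise ValueError("both rankings must cover the same languages")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two languages")
    squared_diffs = sum((rank_a[lang] - rank_b[lang]) ** 2 for lang in rank_a)
    return 1 - 6 * squared_diffs / (n * (n * n - 1))


# Placeholder inputs (hypothetical, NOT real data).
# GitHub side: a ranking taken as given, as the article describes.
github_rank = {"lang_a": 1, "lang_b": 2, "lang_c": 3, "lang_d": 4, "lang_e": 5}
# StackOverflow side: tag volumes that are converted into a ranking.
stackoverflow_tags = {
    "lang_a": 9000,
    "lang_b": 7000,
    "lang_c": 8000,
    "lang_d": 2000,
    "lang_e": 1000,
}

stackoverflow_rank = ranks_from_counts(stackoverflow_tags)
rho = spearman(github_rank, stackoverflow_rank)
print(round(rho, 3))
```

A value near 1 means the two communities order the languages almost identically. A value near 0 means the orderings are unrelated.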
The truth, however, is that with respect to language popularity, very little changes on a month-to-month basis. While we do snapshot the necessary data monthly in case we require it for more detailed analysis, the more interesting insights come when we can examine the data over longer periods of time. Having collected data over a period of years, we are now able to do that.
Here, to begin with, is an up-to-date plot of programming language popularity.
With more languages being tracked than previously, it can be difficult to process this plot effectively. As has traditionally been the case, rough groupings or tiers of languages are apparent. And if one compares this plot to previous iterations, it’s possible to detect progress amongst specific languages. Scala, as one example, seems to be gradually progressing to the top of the second language tier.
But because this plot can be difficult to decipher by itself, we’ve extracted a list of the Top 20 programming languages by popularity here.
- Visual Basic
But while there may be a few surprises on this list – the continued traction of Java, as an example, is unexpected for some – by and large this list seems like nothing more or less than a reasonable representation of programming languages in use today. It is an inclusive list, from compiled to interpreted and everything in between, and thus more evidence of the runtime fragmentation that has been rampant for several years [coverage].
But what if we compare this September 2012 snapshot to Drew’s original analysis in December of 2010, just shy of two years ago? What has changed with these languages overall in that time?
- Clojure -6 (Dropped out of the Top 20)
- Emacs Lisp -6 (Dropped out of the Top 20)
- ActionScript -4
- Lua -3 (Dropped out of the Top 20)
- Perl -3
- C -2
- R -2
- PHP -1
- Python -1
- C++ 0
- Groovy 0
- Objective-C 0
- Ruby 0
- Shell 0
- Haskell +1
- Scala +1
- ASP +2
- Java +2
- C# +5
- Visual Basic +5 (Added to the Top 20)
- Assembly +6 (Added to the Top 20)
- CoffeeScript +18 (Added to the Top 20)
Outside of movement in the Top 20, there have been questions recently around Go, a language introduced late in 2009. Apcera’s Derek Collison, in particular, is bullish on the language, saying:
Prediction: Go will become the dominant language for systems work in IaaS, Orchestration, and PaaS in 24 months. #golang
— Derek Collison (@derekcollison) September 11, 2012
The numbers are not quite so bullish, but they do provide some grounds for optimism for advocates of the language. First, there’s its trajectory. Our rankings have Go jumping from #32 in 2010 to #30 today, a gain that sounds modest but means that in that time it has improved more in the rankings than Scala or Haskell and as much as Java (obviously growth becomes more difficult the more popular the language becomes). Second, there’s its age. At a bit less than three years of age, Go’s position as a solidly second tier language is enviable, given that there are much older languages like Smalltalk that have yet to break that barrier.
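The rank arithmetic behind these comparisons is simple enough to sketch. The snippet below assumes the sign convention that the figures above appear to follow: a change is the old rank minus the new rank, so a positive number means a language moved up. The function names and the `lang_x` and `lang_y` entries are hypothetical. Only Go's move from #32 to #30 comes from the text.

```python
TOP_N = 20  # size of the "Top 20" list discussed in the article


def rank_change(old_rank, new_rank):
    """Positions gained between two snapshots (positive = moved up)."""
    return old_rank - new_rank


def describe(lang, old_rank, new_rank, top_n=TOP_N):
    """Format one list entry, noting moves across the Top N boundary."""
    delta = rank_change(old_rank, new_rank)
    note = ""
    if old_rank > top_n >= new_rank:
        note = f" (Added to the Top {top_n})"
    elif new_rank > top_n >= old_rank:
        note = f" (Dropped out of the Top {top_n})"
    return f"{lang} {delta:+d}{note}"


# Go's figures are quoted from the article: #32 in 2010, #30 today.
print(describe("Go", 32, 30))

# Hypothetical placeholder languages, only to exercise the boundary notes.
print(describe("lang_x", 25, 18))
print(describe("lang_y", 15, 21))
```

Because the comparison uses rank positions rather than raw volumes, a two-place gain near #30 and a two-place gain near the top of the list count the same, even though the latter is harder to achieve.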
Ultimately, these rankings are intended to serve as a datapoint, a snapshot of traction within two particular communities that happen to be substantial centers of gravity from a development perspective. While not strictly representative, they do confirm one of the more important developer trends observed within the past decade: fragmentation. As with so many areas of technology today, the programming language landscape is wildly diverse, with multiple languages being employed simultaneously by individual developers, often on the same project. Whatever your feelings on the specifics of the rankings above or the merits of the languages themselves, be aware that all of the listed languages are present, and present in volume, within today’s developer populations.