This iteration of the RedMonk Programming Language Rankings is brought to you by IBM. From Java to Node.js, IBM remains at the forefront of open source innovation. Try our code patterns to help build the future of open source.
With the second quarter looming, it’s time for us to drop our first quarter bi-annual Programming Language rankings. As always, these are a continuation of the work originally performed by Drew Conway and John Myles White late in 2010. While the specific means of collection has changed, the basic process remains the same: we extract language rankings from GitHub and Stack Overflow, and combine them for a ranking that attempts to reflect both code (GitHub) and discussion (Stack Overflow) traction. The idea is not to offer a statistically valid representation of current usage, but rather to correlate language discussion and usage in an effort to extract insights into potential future adoption trends.
Our Current Process
The data source used for the GitHub portion of the analysis is the GitHub Archive. We query languages by pull request in a manner similar to the one GitHub used to assemble the 2016 State of the Octoverse. Our query is designed to be as comparable as possible to the previous process.
- Language is based on the base repository language. While this continues to have the caveats outlined below, it does have the benefit of cohesion with our previous methodology.
- We exclude forked repos.
- We use the aggregated history to determine ranking (though, given the table structure changes, this can no longer be accomplished via a single query).
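As a rough illustration of the three rules above, here is a minimal Python sketch. It is not the query used for these rankings. The event records are hand-made, the helper names are hypothetical, and the field layout (`type`, `payload.pull_request.base.repo.language`, `fork`) is an assumption modeled on GitHub's public pull request event payloads.

```python
from collections import Counter


def count_pull_requests_by_language(events):
    """Count pull request events per base-repository language, skipping forks."""
    counts = Counter()
    for event in events:
        if event.get("type") != "PullRequestEvent":
            continue  # only pull requests are counted
        base_repo = event["payload"]["pull_request"]["base"]["repo"]
        if base_repo.get("fork"):
            continue  # forked repositories are excluded
        language = base_repo.get("language")
        if language is None:
            continue  # repository has no detected language
        counts[language] += 1
    return counts


def aggregate_history(per_period_counts):
    """Sum per-period counts, since the history spans more than one query."""
    total = Counter()
    for counts in per_period_counts:
        total.update(counts)
    return total


def sample_event(language, fork=False, event_type="PullRequestEvent"):
    """Build a hand-made event record (hypothetical data, assumed field layout)."""
    repo = {"language": language, "fork": fork}
    return {"type": event_type, "payload": {"pull_request": {"base": {"repo": repo}}}}


# Two invented periods of events, standing in for separate query results.
period_one = [sample_event("Python"), sample_event("Go"),
              sample_event("Python", fork=True)]
period_two = [sample_event("Python"), sample_event("Go"), sample_event("Go"),
              sample_event(None), sample_event("Rust", event_type="PushEvent")]

totals = aggregate_history(
    count_pull_requests_by_language(period) for period in (period_one, period_two)
)
print(totals.most_common())
```

Counting each period separately and summing afterwards mirrors the note that the aggregated history can no longer be gathered in a single query.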
For Stack Overflow, we simply collect the required metrics using their useful data explorer tool.
With that description out of the way, please keep in mind the other usual caveats.
- To be included in this analysis, a language must be observable within both GitHub and Stack Overflow.
- No claims are made here that these rankings are representative of general usage more broadly. They are nothing more or less than an examination of the correlation between two populations we believe to be predictive of future use, hence their value.
- There are many potential communities that could be surveyed for this analysis. GitHub and Stack Overflow are used here first because of their size and second because of their public exposure of the data necessary for the analysis. We encourage, however, interested parties to perform their own analyses using other sources.
- All numerical rankings should be taken with a grain of salt. We rank by numbers here strictly for the sake of interest. In general, the numerical ranking is substantially less relevant than the language’s tier or grouping. In many cases, one spot on the list is not distinguishable from the next. The separation between language tiers on the plot, however, is generally representative of substantial differences in relative popularity.
- In addition, the further down the rankings one goes, the less data available to rank languages by. Beyond the top tiers of languages, depending on the snapshot, the amount of data to assess is minute, and the actual placement of languages becomes less reliable the further down the list one proceeds.
- Languages that have communities based outside of Stack Overflow such as Mathematica will be under-represented on that axis. It is not possible to scale a process that measures one hundred different community sites, both because many do not have public metrics available and because measuring different community sites against one another is not statistically valid.
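The inclusion rule and the combination of the two axes can be sketched roughly as follows. This is a hypothetical illustration only: the language names and counts are invented, and summing the per-axis ranks is an assumption made for the example, since the exact formula used to combine the axes is not described here.

```python
def rank_positions(counts):
    """Map each language to its 1-based position, highest count first."""
    ordered = sorted(counts, key=lambda lang: (-counts[lang], lang))
    return {lang: position for position, lang in enumerate(ordered, start=1)}


def combine_axes(github_counts, stackoverflow_counts):
    """Combine two axes into one ordering (assumed method: sum of ranks)."""
    # A language must be observable on both axes to be included.
    shared = set(github_counts) & set(stackoverflow_counts)
    github_rank = rank_positions({lang: github_counts[lang] for lang in shared})
    so_rank = rank_positions({lang: stackoverflow_counts[lang] for lang in shared})
    # Assumption: a lower rank sum means a higher overall placement.
    combined = {lang: github_rank[lang] + so_rank[lang] for lang in shared}
    return sorted(combined.items(), key=lambda item: (item[1], item[0]))


# Invented counts for placeholder languages; "Delta" has no Stack Overflow data.
github_counts = {"Alpha": 900, "Beta": 700, "Gamma": 300, "Delta": 50}
stackoverflow_counts = {"Alpha": 800, "Beta": 950, "Gamma": 100}

combined_ranking = combine_axes(github_counts, stackoverflow_counts)
print(combined_ranking)
```

In this toy data "Delta" drops out because it appears on only one axis, and "Alpha" and "Beta" end up with the same combined score, which is the kind of tie the caveats above warn against over-interpreting.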
With that, here is the first quarter plot for 2019.
Besides the above plot, which can be difficult to parse even at full size, we offer the following numerical rankings. As will be observed, this run produced several ties which are reflected below (they are listed out here alphabetically rather than consolidated as ties because the latter approach led to misunderstandings).
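One way to produce a listing of this kind, where tied languages share a rank number but appear on separate lines in alphabetical order, is sketched below. The scores are invented, the function name is hypothetical, and lower scores are assumed to rank higher.

```python
def list_with_ties(scores):
    """Return (rank, language) pairs; ties share a rank and are listed alphabetically."""
    # Sort by score first, then by name so tied languages come out alphabetically.
    ordered = sorted(scores.items(), key=lambda item: (item[1], item[0]))
    listing = []
    previous_score = None
    rank = 0
    for position, (language, score) in enumerate(ordered, start=1):
        if score != previous_score:
            rank = position  # a new score takes the rank of its list position
            previous_score = score
        listing.append((rank, language))
    return listing


# Invented combined scores for placeholder languages.
example_scores = {"Gamma": 6, "Beta": 3, "Alpha": 3, "Delta": 6, "Epsilon": 9}
listing = list_with_ties(example_scores)
for rank, language in listing:
    print(rank, language)
```

With this scheme two languages tied at one rank are followed by a language two ranks lower, so a tie consumes as many list positions as it has members.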
As expected, there was little movement within our Tier 1 languages. Generally speaking, the top ten to twelve languages in these rankings tend to be relatively static, with changes both rare and minor in nature. While the landscape remains fantastically diverse in terms of technologies and approaches employed, including the variety of programming languages in common circulation, code written and discussion are counting metrics, and thus accretive. This makes growth for new languages tougher to come by the higher they ascend the rankings – which makes any rapid growth that much more noticeable.
Go (-1), R (-1): With TypeScript jumping into the twelfth spot on the rankings, something had to give, and in part it was Go and R, which each dropped one spot into a tie for #15. In the grand scheme of things, this is relatively meaningless, as the difference between one spot and another is often superficial, particularly the further down the list a language is found. This is particularly true for the R language, which continues to demonstrate robust, near Tier 1 usage thanks to a vibrant base of analytical and data science use cases. Given the domain-specific nature and comparatively narrow focus of R, its prospects probably do not include a Top 10 placement; the mid second tier is likely its ceiling. For Go, on the other hand, it is reasonable to question what its stagnation in this second tier means for the language’s future. It is highly regarded technically, and enjoys popularity across a wide variety of infrastructure projects. To date, however, it has not demonstrated an ability or inclination to follow in the footsteps of languages such as Java and expand its core use cases.
Kotlin (+8), Scala (-1), Clojure (-3), Groovy (-3): One of the primary questions we had going into this quarter’s rankings was whether or not JVM-based languages such as Clojure, Groovy and Scala could repeat the last rankings’ performance in which all three grew while newcomer Kotlin declined. We now have a clear answer to that question, and it’s no. For this quarter, at least, Kotlin grew substantially while all three of its fellow JVM-based counterparts declined. Kotlin jumped so far, in fact, that it finally broke into the Top 20 at #20 and leapfrogged Clojure (#24) and Groovy (#24) while doing so. It’s still well behind Scala (#13), but Kotlin’s growth has been second only to Swift in the history of these rankings, so it will be interesting to see what lies ahead in the next run or two.
Julia: For a language that isn’t even in the Top 30, Julia continues to attract questions about its performance and future. Its growth has been more tortoise than hare, but it’s up another two spots to place #34. While there is no technical basis for comparison, it is worth noting that three years ago in our Q1 rankings TypeScript made a similar modest jump from #33 to #31. That is not to say that Julia is destined to follow in TypeScript’s footsteps, of course, but rather to serve as a reminder that, while it’s uncommon, languages can transition quickly from periods of slow, barely measurable growth to high, sustained growth quarter after quarter.
Rust: Last on our list is Rust, which neither grew nor declined but instead held steady at #23. This may be disappointing for its more ardent fans, who include some high profile and highly accomplished technologists, but Rust’s glacial ascent is relatively unsurprising. Rust targets workloads similar to, if lower level than, those of Go, a language that has itself plateaued in these rankings. It therefore suffers from the limits of a lower popularity ceiling while not receiving quite the same attention that Go did as a product of Google generally and people like Rob Pike specifically. By comparison, Rust’s ascent has been much more workmanlike, winning its serious fans over one at a time. It’s also worth noting that even if Rust never gets much beyond where it is today, it’s still ranking higher than well known languages such as the aforementioned Clojure and Groovy, as well as CoffeeScript, Dart or Visual Basic. Not bad for a systems language.
Credit: My colleague Rachel Stephens evaluated the available options for extracting rankings from GitHub data, and wrote and executed the queries that are responsible for the GitHub axis in these rankings. She is also responsible for the query design and collection for the Stack Overflow data.