A month away from our third quarter run, it’s probably about time for us to post our first quarter language rankings. We’ve been busy at RedMonk with our first group travel week in three years, a slew of client work and the planning of a new RedMonk offering you’ll hear more about shortly. But we dutifully collected the language ranking data back in January and have finally carved out the time to share it with all of you.
In the meantime, as a reminder, this work is a continuation of the work originally performed by Drew Conway and John Myles White late in 2010. While the specific means of collection has changed, the basic process remains the same: we extract language rankings from GitHub and Stack Overflow, and combine them for a ranking that attempts to reflect both code (GitHub) and discussion (Stack Overflow) traction. The idea is not to offer a statistically valid representation of current usage, but rather to correlate language discussion and usage in an effort to extract insights into potential future adoption trends.
Our Current Process
The data source used for the GitHub portion of the analysis is the GitHub Archive. We query languages by pull request in a manner similar to the one GitHub used to assemble the State of the Octoverse. Our query is designed to be as comparable as possible to the previous process.
- Language is based on the base repository language. While this continues to have the caveats outlined below, it does have the benefit of cohesion with our previous methodology.
- We exclude forked repos.
- We use the aggregated history to determine ranking (though, given changes to the table structure, this can no longer be accomplished via a single query).
- For Stack Overflow, we simply collect the required metrics using their useful data explorer tool.
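The GitHub-side counting described above can be sketched roughly as follows. This is a minimal Python illustration, not the query used for these rankings: the event dictionaries are hand-made stand-ins for pull request records, the field names `base_repo_language` and `base_repo_is_fork` are hypothetical, and breaking ties alphabetically is an assumption made only to keep the output deterministic.

```python
from collections import Counter

# Hand-made stand-ins for pull request events. The field names are
# hypothetical, chosen for readability rather than taken from GitHub Archive.
events = [
    {"base_repo_language": "Python", "base_repo_is_fork": False},
    {"base_repo_language": "Python", "base_repo_is_fork": False},
    {"base_repo_language": "Go", "base_repo_is_fork": False},
    {"base_repo_language": "Go", "base_repo_is_fork": True},    # skipped: fork
    {"base_repo_language": None, "base_repo_is_fork": False},   # skipped: no language
    {"base_repo_language": "JavaScript", "base_repo_is_fork": False},
]


def count_pull_requests_by_language(events):
    """Count pull requests per base repository language, skipping forks."""
    counts = Counter()
    for event in events:
        language = event.get("base_repo_language")
        if language is None or event.get("base_repo_is_fork"):
            continue
        counts[language] += 1
    return counts


def rank_languages(counts):
    """Return {language: rank}, where rank 1 has the most pull requests.

    Ties are broken alphabetically (an assumption made for this sketch).
    """
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {language: position
            for position, (language, _) in enumerate(ordered, start=1)}


github_counts = count_pull_requests_by_language(events)
github_ranks = rank_languages(github_counts)
```

With the sample events above, the forked repository and the record without a language are dropped before any counting happens, so they never influence the ranking.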
With that description out of the way, please keep in mind the other usual caveats.
- To be included in this analysis, a language must be observable within both GitHub and Stack Overflow. If a given language is not present in this analysis, that’s why.
- No claims are made here that these rankings are representative of general usage more broadly. They are nothing more or less than an examination of the correlation between two populations we believe to be predictive of future use, hence their value.
- There are many potential communities that could be surveyed for this analysis. GitHub and Stack Overflow are used here first because of their size and second because of their public exposure of the data necessary for the analysis. We encourage, however, interested parties to perform their own analyses using other sources.
- All numerical rankings should be taken with a grain of salt. We rank by numbers here strictly for the sake of interest. In general, the numerical ranking is substantially less relevant than the language’s tier or grouping. In many cases, one spot on the list is not distinguishable from the next. The separation between language tiers on the plot, however, is generally representative of substantial differences in relative popularity.
- In addition, the further down the rankings one goes, the less data is available to rank languages by. Beyond the top tiers of languages, depending on the snapshot, the amount of data to assess is minute, and the actual placement of languages becomes less reliable the further down the list one proceeds.
- Languages that have communities based outside of Stack Overflow such as Mathematica will be under-represented on that axis. It is not possible to scale a process that measures one hundred different community sites, both because many do not have public metrics available and because measuring different community sites against one another is not statistically valid.
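To make the first caveat concrete, here is a small Python sketch of combining two rankings while keeping only the languages observable in both sources. The rank values are invented for illustration, and averaging the two ranks is an assumption made for this example, not a description of how these rankings are actually combined.

```python
# Invented rank positions for illustration only (1 = most popular).
github_ranks = {"JavaScript": 1, "Python": 2, "Go": 3, "HCL": 4}
stack_overflow_ranks = {"JavaScript": 1, "Python": 2, "Go": 4, "Mathematica": 3}


def combine_rankings(github, stack_overflow):
    """Order languages seen in both sources by the mean of their two ranks.

    Averaging is an assumption made for this sketch. Languages missing from
    either source are dropped, mirroring the inclusion rule described above.
    """
    shared = github.keys() & stack_overflow.keys()
    scores = {lang: (github[lang] + stack_overflow[lang]) / 2 for lang in shared}
    # Sort by score, then by name so that ties come out in a stable order.
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))


combined = combine_rankings(github_ranks, stack_overflow_ranks)
for language, score in combined:
    print(f"{language}: {score}")
```

In this toy data, HCL and Mathematica fall out of the result because each appears in only one of the two sources.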
With that, here is the first quarter plot for 2023.
The dominant trend, again, is the lack of movement. While the industry around these programming languages is evolving rapidly, the inertia of language traction has proven difficult to overcome. Certainly this method of measurement rewards incumbent languages as it is by design accretive in nature, but we see little evidence elsewhere to suggest a counter-narrative of rapid ascents and descents of programming languages in general.
One potential wildcard in programming language movement that has emerged is the explosion in usage of LLM-based tooling. Obviously we’re not seeing any major shifts due to these tools as yet, but it is plausible that they could begin to have an impact. Some of that could arrive via the fact that LLMs, at this point, are better trained on some languages than others, which could tilt users of these tools in favor of one language at the expense of another. It is also possible, however, that the ability of LLM tooling to rapidly educate and train users on new and unfamiliar technologies like new programming languages could lower the barriers to entry for new languages, and thus encourage broader rather than narrower language employment.
Whatever the impacts might be, however, none have been observed yet in the data, and it’s only a hypothesis in any case. Which leaves the status quo unchanged, and the status quo is of an increasingly static language landscape. It is static to the point, indeed, that internal discussions are underway about the possibility of shifting from bi-annual to annual language rankings, because there just hasn’t been much change to be tracked otherwise. We’ll likely take a look at our next run of data before we make any final determinations, however, just as we will explore other measurement models which might provide a more real-time view.
With that, some results of note:
Ballerina: a five-year-old open source language designed by WSO2 for combining services in cloud environments, Ballerina made its first appearance a few runs back and has typically held steady in the high 80s – 87 in our last run, 89 in this quarter’s. It’s in circulation, in other words, but hasn’t yet achieved the usage velocity that would jump it up to be competitive with even configuration-oriented languages like HCL (45) or Puppet (39). Language growth is a difficult task to achieve in the best of times, and with such a crowded landscape of options at this point it’s far from the best of times, so resources will have to be applied to change the current trajectory.
Clojure: present in the back half of our top 20 from 2014 to 2017, Clojure has now slipped down to 27 – which admittedly is outperforming fellow one-time high fliers Visual Basic at 30 and CoffeeScript at 31. A Lisp dialect that was one of the Groovy/JRuby/etc JVM-based Java alternatives, Clojure has steadily lost ground in these rankings, as have most of its counterparts with the exception of Scala. It’s fair to wonder how much the rapid growth of Kotlin has impacted these languages, because in addition to competing with other languages, the JVM-based cohort is competing for space on top of that specific platform.
Dart/Kotlin/Rust (0): speaking of the growth of Kotlin, it – along with Dart and Rust – has been notable for its lack of recent growth. All three surged up into the back end of our top 20, only to more or less stall out there. As noted in the introduction to this section, growth is difficult to achieve broadly speaking and becomes only more difficult as languages progress towards the top of the rankings. It will be interesting to see if any of these three can achieve separation from the other two and resume its march upwards. And before someone helpfully points out that these languages are quite distinct functionally and not intended for the same workloads, let me note that they are grouped here strictly because of their similar trajectories within the rankings – not because we expect Dart, say, and Rust to be competing for workloads.
Go (1): Go jumped a spot during our last rankings, and the question was whether it had any more growth in it. It did, jumping one spot for the second time in a row. It’s a modest bump, to be sure, but given the recent fast growth of back end languages like Kotlin and Rust there were some questions as to whether or not Go would begin to lose traction. Instead, it’s sustained its position and even rebounded somewhat, a promising development for Go advocates and users. As with the aforementioned Kotlin and Rust, however, it remains to be seen whether the 14ish range represents a plateau from which Go cannot ascend further or whether it’s a ladder to further growth.
Objective C (-4): just as we asked whether Go’s bump last quarter was temporary, we asked the same of Objective C, which had seen a rare jump. The answer now is clear: yes, the bump was temporary. In January’s run, Objective C dropped four spots – a large decline for rankings in which movement tends to be more deliberate. As has been covered before, however, outside of its large established codebase Objective C doesn’t have much that argues for growth potential. Its biggest proponent in Apple has anointed the more syntactically friendly Swift as its successor and replacement, and it lacks other large workloads to tackle. Still, there’s a large body of Objective C code that’s not going anywhere, which means that Objective C’s decline should be gradual.
Credit: My colleague Rachel Stephens wrote the queries that are responsible for the GitHub axis in these rankings. She is also responsible for the query design for the Stack Overflow data.