RedMonk

Are we getting better at designing programming languages?

In the aftermath of my earlier work on the expressiveness of programming languages, I started wondering whether our ability to design and choose optimal languages might have improved since the ’50s, when Fortran and Lisp were invented. To answer this question, I returned to my original data on the expressiveness of the top two language tiers by our ranking. By mapping the medians against the years each language was invented, I created the plot below:

Shown is the median weighted expressiveness of each language (as calculated in the previous post) plotted against the year each language was invented or first published.

I’ve broken this plot into four clusters. The first two, in gray, mark the overall low- and high-expressiveness groups. The third, in red, contains all new, popular languages from the past decade, while the empty cluster (“The Gap”) shows the complete lack of less expressive new, popular languages. Each language name is colored red (tier one) or black (tier two) to make it easier to assess where the most highly used languages fall.

Caveats? The meaning of expressiveness as measured by this metric is deep and complex. As mentioned in the initial post on the topic, JavaScript’s position appears to be an artifact of its unusual development norms. And if you look closely, perhaps a language here and there doesn’t fall quite where you’d expect. But overall, things largely make sense, and this is borne out by correlations between my measurement of expressiveness and developer surveys.

What can we get out of this?

Broadly, we’ve gotten better at designing or choosing expressive languages. In both the low-level and high-level clusters, a slight to moderate downhill slope is visible. Because I’ve shown only popular languages rather than all new ones, it’s difficult to tell whether our design skills have improved or whether this is a larger crowd effect of natural selection.
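To put a number on a "downhill slope" like the one described above, one option is an ordinary least-squares fit of score against year within each cluster. The sketch below is only an illustration: the function name `trend_slope` and the toy points are hypothetical, the values are made up rather than taken from the plot, and it assumes that a lower score means a more expressive language.

```python
from statistics import mean


def trend_slope(points):
    """Ordinary least-squares slope of score against year.

    points: iterable of (year, score) pairs for a single cluster.
    A negative result means the scores fall as the years advance.
    """
    pts = list(points)
    if len(pts) < 2:
        raise ValueError("need at least two points to fit a slope")
    year_mean = mean(year for year, _ in pts)
    score_mean = mean(score for _, score in pts)
    # Sum of squared deviations in year, and the year/score cross term.
    sxx = sum((year - year_mean) ** 2 for year, _ in pts)
    if sxx == 0:
        raise ValueError("all points share the same year")
    sxy = sum((year - year_mean) * (score - score_mean) for year, score in pts)
    return sxy / sxx


# Made-up placeholder values for illustration, NOT the data behind the plot.
toy_cluster = [(1960, 500.0), (1980, 400.0), (2000, 300.0)]
slope = trend_slope(toy_cluster)
print(f"score change per year: {slope:.2f}")
```

Running the fit separately on each gray cluster would give one slope per cluster. With so few languages per cluster, any such number should be read as a rough summary rather than a statistically solid trend.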

Tier-one languages are in both high- and low-expressiveness clusters. Interestingly, however, most of the ones in the high-level cluster got much slower uptake than those in the low-level cluster. For example, compare Java and C# vs Ruby and Python. While you could argue about their relative popularity today, I don’t think you can make a viable argument that Ruby and Python got faster uptake from a large population after their initial releases.

An exceedingly small number of languages have strong staying power. Of today’s tier-one languages, only C and shell script are more than 30 years old, while the remainder are all 20–30 or 10–20 years old. The next-closest example, Objective-C, should nearly be disqualified because it’s only tier one due to the recent (on this timescale) popularity of iOS.
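The age bands used above are simple arithmetic on the year a language first appeared. A small sketch follows, with some assumptions stated up front: the helper name `age_bucket` is hypothetical, the reference year of 2013 is a guess at "today" for this post, the years are commonly cited first-appearance dates rather than values read off the plot, and ages that fall exactly on a boundary go to the younger band.

```python
def age_bucket(year_invented, reference_year):
    """Place a language in one of the age bands used in the text."""
    age = reference_year - year_invented
    if age > 30:
        return "more than 30"
    if age > 20:
        return "20-30"
    if age > 10:
        return "10-20"
    return "10 or less"


# Assumption: 2013 stands in for "today"; change it to re-run the bucketing.
REFERENCE_YEAR = 2013

# Commonly cited first-appearance years (assumed, not taken from the plot).
first_appeared = {"C": 1972, "Python": 1991, "Java": 1995}

for language, year in first_appeared.items():
    print(language, age_bucket(year, REFERENCE_YEAR))
```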

In the past 10 years, no very popular, lower-expressiveness languages have shown up. None of the tier-one languages are less than a decade old. I think this says something about the time it takes to gain significant adoption, even in the predictive communities we rely upon for our popularity analysis (GitHub and Stack Overflow, in this case). Of the ones in the red cluster, it wouldn’t surprise me to see Go, CoffeeScript, or even Clojure show up on the tier-one list in time, based on the changes in our rankings.

All in all, it’s very suggestive that something important has changed in the past decade regarding how developers design, evaluate, and select programming languages. It could be reflective of the shift toward developer empowerment rather than management handing down decrees from on high regarding which technical choices get made.

Categories: adoption, data-science, programming-languages.