tecosystems

Black Duck and Programming Language Adoption: The Rise of JavaScript

Some of the questions we get most frequently at RedMonk concern programming language usage: which languages are being used, how much, and what are the respective growth/decline trajectories? Because there is no single canonical source for this data – even representative surveys are problematic – we examine as many distinct sources as we can to form a larger picture.

One of these comes from our client Black Duck, whose already significant Knowledge Base was substantially expanded by its October acquisition of Ohloh. Black Duck’s primary mission in life is digesting information about open source code, from license to language, to streamline the consumption process for enterprises. As it turns out, this data can also be used to understand developer trends. The folks from Black Duck have been kind enough to share some of the language usage data from their knowledge base, which I in turn will relay to you here. We hope to make this a regular exercise.

Before I proceed, two things to note.

First, the data supplied by Black Duck included usage for the top 13 languages, but I’ve filtered that down to the seven you see here. Among those filtered was what Black Duck defines as “shell,” which was one spot higher than Ruby; it was omitted because part of that volume is likely configuration and installation shell scripts, which are not what I’m interested in here. The other languages were omitted – as with C# – because their overall usage was insignificant (1.2% all time) and their growth or decline was negligible.

Second, the dates selected were arbitrary for this instance, because this was an ad hoc query run at our request. Consider timing as necessary when evaluating this data.

First up is all time usage data. This represents the percent of programming language usage within the Black Duck Knowledge Base.

Black Duck Knowledge Base Language Usage (All Time)

This data contains few surprises. C (44.6%), C++ (13.3%) and Java (9.4%) are the volume languages, with JavaScript, PHP, Python, and Ruby showing more modest but still significant traction.
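As a rough illustration of how a usage share like the ones above could be derived, the Python sketch below turns per-language line counts into percentages. The line counts, the `usage_share` helper, and the `hidden` set are all hypothetical, and the choice to keep filtered languages in the total is an assumption; none of this is Black Duck's data or its documented methodology.

```python
def usage_share(line_counts):
    """Return each language's share of the total line count, as a percentage."""
    total = sum(line_counts.values())
    if total == 0:
        raise ValueError("no lines counted")
    return {lang: 100.0 * count / total for lang, count in line_counts.items()}


# Hypothetical line counts for illustration only, NOT Black Duck figures.
sample_counts = {"C": 500, "Java": 300, "JavaScript": 150, "shell": 50}

shares = usage_share(sample_counts)

# Assumption: a filtered language still counts toward the total and is
# simply left off the chart.
hidden = {"shell"}
charted = {lang: pct for lang, pct in shares.items() if lang not in hidden}

# Print the charted languages from largest share to smallest.
for lang, pct in sorted(charted.items(), key=lambda item: item[1], reverse=True):
    print(f"{lang}: {pct:.1f}%")
```

If the filtered languages were instead dropped before computing the total, every remaining share would come out somewhat higher, so the treatment of omitted languages matters when comparing against other sources.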

Next, let’s examine the usage pattern for the twelve months prior to 8.12.2009 and the year ending last Monday, 3.28.2011.

Black Duck Programming Language Usage

In the 19 months between those dates, we’re seeing an interesting shift. Note, for example, that in the twelve months trailing March 28th, JavaScript passed Java. The pattern is more apparent if we depict just the delta between the years. This represents the percentage change from the year ending August 2009 to the year ending March 2011.

Percentage of Change in Language Usage

This data supports the view that dynamic languages like JavaScript and Ruby are gaining share, possibly at the expense of traditional enterprise languages like C++ and Java. Note the odd growth in C, however; this may be an outlier as we’ll see below. The dynamic language gains are modest relative to total volume, of course. For context, the most popular dynamic language here, JavaScript, still represents less than a fourth of the total lines of C as of last week.
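A "percentage of change" between two periods can be read in two ways, and the data as relayed here does not say which one is charted. The Python sketch below shows both readings under that stated uncertainty; the function names and the sample shares are hypothetical and are not taken from the Black Duck data.

```python
def point_change(old_share, new_share):
    """Difference between two shares, in percentage points."""
    return new_share - old_share


def relative_change(old_share, new_share):
    """Change relative to the starting share, as a percentage."""
    if old_share == 0:
        raise ValueError("relative change is undefined for a zero starting share")
    return 100.0 * (new_share - old_share) / old_share


# Hypothetical shares for one language in two periods (NOT Black Duck data).
old_share, new_share = 8.0, 10.0

print(point_change(old_share, new_share))     # percentage points
print(relative_change(old_share, new_share))  # percent of the starting share
```

The distinction matters most for small languages: a move of a couple of percentage points is a large relative gain for a language with a small share and a barely visible one for a language the size of C.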

Percentage of Change in Language Usage

When we compare March’s figures to the all time volume, meanwhile, the pattern is even more pronounced: dynamic languages have universally gained share, while C, C++ and Java all have declined.

Conclusions

The data here seems to validate two recent conclusions: first, that JavaScript, Python, and Ruby frameworks are experiencing growth [coverage]. Usage of a developer framework, of course, is directly correlated with use of the language itself. Second, that Java has peaked from a relative adoption standpoint but remains a volume platform, with more lines of code than Python and Ruby combined [coverage].

The data also suggests that JavaScript in particular is seeing substantial growth, with the best growth rate against the all time data set and the second best versus 2009. That growth was sufficient for it to overtake Java in volume over the trailing twelve months. Ruby performs only slightly less well, with the best overall growth rate from 2009 to 2011. GitHub’s data is similar; almost four months ago, JavaScript passed Ruby as the most popular language on the site.

With the caveat, then, that the above data is simply what’s measurable by Black Duck and cannot therefore be considered representative in a strict statistical sense of developers worldwide, it may be time for you to look at JavaScript. And maybe node.js, while you’re at it [coverage].

Credit: Our thanks again to Black Duck for sharing this data.

16 comments

  1. I'm curious how much effort, if any, is made to account for the fact that the JavaScript in most Python, Ruby or PHP applications is simply copies of distributed libraries. In my case personally, GitHub indicates that my projects are 40% JavaScript. The reality though is that there's almost no original JavaScript in my applications. It's just verbatim copies of existing libraries.

    In terms of development effort I have multiple man years in the Ruby code that GitHub indicates is 60% of my code base and at most several weeks in the JavaScript that it indicates is 40%.

    Other languages don't benefit from that same kind of inflation because they typically link to external libraries, whereas JavaScript libraries are included as source code.

    I'm not suggesting that JavaScript's not growing. I'm absolutely sure it is. I am however very suspicious of the numbers above.

  2. @Mike Greenly: that is difficult to account for, and I know the GitHub guys have tried to work around it with regular expression exclusions.

    I can’t speak for Black Duck here, but I suspect they’ll have some intelligence in terms of evaluating projects versus libraries.

    I’ll ask them, though.

  3. Interesting data. The growth in Ruby is astonishing. I bet shell-based build system files could be excluded easily enough; there's only a handful of common filenames.

    I don't really understand what the dates mean in this context. Is this all code in existence on a certain date, or new code committed on a certain date, or projects that released tarballs on a certain date, or what?

    Do you have the graph with the absolute numbers side-by-side for all time vs present?

    I'd love to watch a movie of cumulative growth per year to see whether the changes are consistent or whether they bounce around year to year.

    Since you mentioned statistical validity — from a statistical point of view, taking multiple random samples of their total database and looking at the distributions across those samples would provide more robust results.

  4. Several people have been saying so for a few years.

    Check out http://goo.gl/OBmWQ

    — MV

  5. […] looked at a couple attempts to measure it before: a Github language survey and the TIOBE Index. Now RedMonk has published some data from code management vendor Black […]

  9. I'm very curious about how the data was measured. Lines of code? Adjusted for language verbosity perhaps? Number of components? Did Black Duck provide any commentary on how the measurements were determined?

  11. […] that you are reading– only to then disappear under an avalanche of more interesting posts, it turns out that JavaScript surpassed Java in popularity during […]

  12. […] more than thirty years, depending on whether one considers the TIOBE data or these figures, C++ remains one of the most universally widespread programming languages. Apple Mac OS X, […]

  14. I’m a bit sceptical about the way we measure language usage here:

    A big old project will grow slowly in LOC but may have many more committers. This is simply because it is more time-consuming to track down a bug or add a feature in software with 10 million lines of code than in software with 10 thousand lines of code.

    I see a real life example from my own experience. At my previous job, working on a new small project (100K LOC), a developer would produce on average something like 50 LOC of Java code per working day.

    Now I’m working on a big piece of software with more than a dozen million LOC. The average is 7 LOC/dev/day. More interesting, new development is more like 2 LOC/dev/day.

    Another problem: git is about open source and was created as the source control software for the Linux kernel (written in C). It is therefore logical that you see a lot of C inside git repos. Python users, for example, tend to use Mercurial more than git. This would lower Python's stats.

    But there is even more. GitHub is about open source. We are not talking about ALL software. Only a few of us make open source software; most developers work for a living, and most of the time that work is on closed source software.

    I really doubt that the average OSS guy will like/use the same language as the average software developer. Only a few OSS projects are really used. Most are just experiments by one guy where there is no constraint on the language. With GitHub there are even many who create a project just to illustrate a blog post or something.

    What does all of this mean? Taken together with the other comments, I really doubt the gathered statistics have any interesting meaning other than that Python guys are more on Mercurial and Java source code is often closed source and so not really visible here.

  15. […] by John D. Cook a series of posts on programming languages: the Register; Daniel Lemire; tecosystems #1; tecosystems […]

  16. […] this one that you are reading– only to then disappear under an avalanche of more interesting posts, it turns out that JavaScript surpassed Java in popularity during […]
