When we founded RedMonk in 2002, we made a conscious decision to focus on qualitative analysis at the expense of quantitative research for the simple reason that we didn’t believe there was representative data available about our core constituency, developers. Traditionally, analyst firms had worked backwards from observable metrics such as server shipments and license revenue estimates. While these numbers were effective for measuring the performance of commercial suppliers, they were entirely unable to assess the performance of non-commercial alternatives. The growth of free software, for example, was largely opaque to quantitative analysis, visible only in the corrosive effects it had on commercial software revenue.
Over the past few years, however, we’ve begun to gradually introduce quantitative analysis into our portfolio – culminating in the fall launch of our RedMonk Analytics product. We’ve begun incorporating numbers because we believe that, for the first time, we have access to quality data from which we can reasonably infer developer behaviors. Some of that data is generated in house: this is the initial basis for RedMonk Analytics, although our system is rapidly incorporating third party data.
But there are many sources of relevant developer-related data today. One such source is the Hacker News dataset collected by Ronnie Roller, creator of iHackerNews.com. Consisting of 1.7M entries from the site, the dataset is an interesting snapshot of developer commentary and interests.
Our first pass through the data in November looked at programming language popularity. Since then, we have continued to crawl the dataset for other topics. This dataset is interesting not because it is representative of developers as a whole, but rather because it’s a community of technologists who are collectively ahead of the curve.
DVCS
Consider the following data we derived back in November from Ohloh regarding usage of version control systems, for example.
Subversion dominates, clearly. As do centralized repositories, generally.
On Hacker News, however, the data reflects a different distribution. Even given the caveat that this data reflects mentions rather than observed instantiations, we find the trends illuminating. Here, for example, is a chart of DVCS options:
Note the reversal of the observed trend; Git dominates Subversion, rather than vice versa. Similarly, the observed preference on Ohloh for centralized repositories over decentralized alternatives inverts.
Again, the Hacker News data reflects the discussion of technologies rather than actual implementations. But given that each of the technologies is freely available, it would be a mistake to conclude that the distribution of mentions has no relationship to actual adoption.
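For readers curious what a mention count looks like in practice, here is a minimal sketch in Python. It is not the tooling used for this post: the sample comments, the term list and the `count_mentions` helper are all hypothetical, and the only technique shown is case-insensitive whole-word matching, so that "GitHub" is not counted as a mention of Git.

```python
import re
from collections import Counter

# Hypothetical sample comments; the real dataset is not reproduced here.
comments = [
    "We moved from Subversion to Git last year.",
    "git bisect saved me hours.",
    "Mercurial is fine, but GitHub won us over.",
]

# Illustrative term list, not the list used for the charts in this post.
terms = ["git", "subversion", "mercurial"]


def count_mentions(texts, terms):
    """Count case-insensitive, whole-word occurrences of each term."""
    counts = Counter({term: 0 for term in terms})
    for term in terms:
        # \b keeps "git" from matching inside "GitHub".
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for text in texts:
            counts[term] += len(pattern.findall(text))
    return counts


mentions = count_mentions(comments, terms)
for term, n in mentions.most_common():
    print(term, n)
```

Whole-word matching is a deliberate choice here: substring matching is simpler, but it would inflate short names such as Git with hits on unrelated words.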
Vendors
One of the other interesting queries was for vendor names. Because they may appear in a variety of contexts, this graph is more for curiosity’s sake than actual analysis.
Microsoft’s performance in the above, like Oracle’s, is likely something of an artifact of its often controversial standing with developers, but its showing is nonetheless impressive. Other surprises were the underperformance of VMware relative to its peers and the better-than-average visibility of Cisco.
Frameworks
One of the requests on Hacker News was for a look at the distribution of framework mentions. Here’s the data:
The dominance of Rails is unsurprising, as is Node.js’s strong showing – we’d expect nothing less from Node given our own internal metrics. I was mildly surprised by Grails’ poor numbers; Zend Framework’s result is likely a byproduct of its two-word name.
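One way to see how a multi-word name can skew a count: commenters may write the full name, a single word of it, or an abbreviation, and an exact-phrase search only catches the first. The sketch below is a hypothetical illustration in Python, not the query used for the chart. The alias lists, the sample comments and the `count_with_aliases` helper are all assumptions.

```python
import re

# Hypothetical alias lists; the variants people actually use may differ.
aliases = {
    "Zend Framework": ["zend framework", "zend", "zf"],
    "Rails": ["ruby on rails", "rails", "ror"],
}

# Hypothetical sample comments.
samples = [
    "Zend Framework is verbose.",
    "We use Zend at work.",
    "Ruby on Rails and rails console.",
]


def count_with_aliases(texts, aliases):
    """Count whole-word matches of any alias, per canonical name."""
    counts = {}
    for name, variants in aliases.items():
        # Longest variants first, so "zend framework" is consumed as one
        # match instead of also being counted again as "zend".
        ordered = sorted(variants, key=len, reverse=True)
        alternation = "|".join(re.escape(v) for v in ordered)
        pattern = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
        counts[name] = sum(len(pattern.findall(text)) for text in texts)
    return counts


with_aliases = count_with_aliases(samples, aliases)
exact_only = count_with_aliases(samples, {"Zend Framework": ["zend framework"]})
print(with_aliases, exact_only)
```

On these three sample comments the exact phrase finds one Zend Framework mention and the alias list finds two. The trade-off is that looser aliases also raise the risk of false positives, so the right list depends on the term.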
Operating Systems
Operating systems, meanwhile, were another mixed bag. Windows, as ever, dominated, but Ubuntu’s outperformance was a mild surprise, if only because the perception exists that Hacker News can be CentOS-centric. That may be true, but the data certainly doesn’t reflect it. In case you’re curious, SUSE’s position relative to Red Hat was not influenced by discussion of the Attachmate acquisition [coverage]: the dataset predates that.
NoSQL
As with distributed version control, NoSQL is a subject that typically finds a welcome audience within the Hacker News community. While conservative enterprises may express little appetite for non-relational tools, developers have been far more pragmatic. Crawling their comments on the subject, we find the following distribution of mentions by datastore.
No real surprises. Mongo is slightly more popular than I would have expected, Hadoop slightly less, but the balance of the data is consistent with our experiences in the marketplace.
Cloud Providers, or: Just How Popular is Amazon?
Very. Even heavily discounting the number of mentions of Amazon as references to its retail businesses rather than its cloud computing stack, Amazon is dominant. Also notable is Heroku’s performance: as above, this dataset predates the acquisition by Salesforce, so the frequency here is unrelated to that event.
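How might one discount the retail references? One rough approach, sketched below in Python, is to split comments that mention Amazon by whether they also contain a cloud-related keyword. This is an assumption about how such a discount could be made, not a description of the method behind the chart; the keyword list, the sample comments and the `split_amazon_mentions` helper are hypothetical.

```python
import re

# Hypothetical hint words suggesting a cloud context; a real list would
# need tuning against the data.
CLOUD_HINTS = re.compile(r"\b(?:aws|ec2|s3|cloud|instance|instances)\b", re.IGNORECASE)
AMAZON = re.compile(r"\bamazon\b", re.IGNORECASE)

# Hypothetical sample comments.
samples = [
    "Amazon EC2 pricing dropped again.",
    "I ordered the book from Amazon.",
    "Running on Amazon S3 and Heroku.",
    "Heroku is great.",
]


def split_amazon_mentions(texts):
    """Return (cloud, other): comments mentioning Amazon with and without a cloud hint."""
    cloud = 0
    other = 0
    for text in texts:
        if not AMAZON.search(text):
            continue  # no Amazon mention in this comment
        if CLOUD_HINTS.search(text):
            cloud += 1
        else:
            other += 1
    return cloud, other


cloud, other = split_amazon_mentions(samples)
print(cloud, other)
```

Note that this counts comments rather than individual mentions, and a keyword filter will misclassify some of them in both directions. It gives a rough bound, not a precise figure.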
Thus concludes this round of Hacker News analytics. If you have questions you’d like to see answered in future, leave a comment or drop me a note. If you’re a RedMonk client, your available hours can be used for custom, on demand crawls of this data. Contact us for details.
Update: By request, we have added Gentoo to the list of operating systems surveyed, Neo4J and FlockDB to the non-relational stores graph, and Joyent to the cloud providers surveyed.
Disclosure: Adobe, Apache (Cassandra, Hadoop, etc), Basho (Riak), Canonical (Ubuntu), Cisco, IBM, Membase, Microsoft (Azure, Windows, etc), Red Hat (Fedora, Makara, RHEL, etc), Salesforce.com (Force.com) and Zend (Zend Framework) are RedMonk clients while Amazon (AWS), Engine Yard, Google (GAE), HP, Oracle and VMware are not currently.
Donnie Berkholz says:
December 14, 2010 at 10:33 pm
I bet the rates of change of VCS adoption on Ohloh would make an interesting graph. Or go a step further and look at accelerating or decelerating adoption… the 1st and 2nd derivatives should be easy enough to calculate if you’ve got time-based data. You might also have the data to actually look at lead times of discussion on Hacker News to actual use on Ohloh.
I’m curious about other Linux distros — obviously Gentoo, but I’ve also been hearing a lot about Mint lately.
On a minor note, I have a couple of suggestions to improve how the data are visualized: (1) you might want to consider using bar graphs instead of pie graphs because relative ratios are really hard to compare on pies; (2) I would probably sort by size instead of alphabetically, but this depends on whether people look for “their” language/framework or care more about the overall most popular ones.
sogrady says:
December 15, 2010 at 1:42 am
@Donnie Berkholz: concur. one of the things i’ll be looking at over time is historical trends, both in these datasets and others. snapshots are interesting, but the temporal element gives it another dimension entirely.
i’ll try and add Gentoo and Mint tomorrow.
as for the graphs, i’m torn. i concur that basic histograms are more effective, but i decided to employ the pie charts beyond the DVCS data (where i think that’s useful) simply for variety’s sake. a bunch of graphs that look the same, i thought, might be a bit boring.
but perhaps you’re right. either way, appreciate the feedback.
Robin says:
December 15, 2010 at 8:09 am
How does OSX fare with OS mentions? Seems like an obvious candidate to include.
Donnie Berkholz says:
December 15, 2010 at 12:11 pm
By the way, could you please find some way to re-enable per-post comment subscription? This is a killer feature and the lack of it makes it really hard to have a real discussion on here.
Kevin Schmidt says:
December 15, 2010 at 12:25 pm
Was Java EE/J2EE not significant among the framework mentions or was it not included?
sogrady says:
December 15, 2010 at 12:33 pm
@Donnie Berkholz: i’ll see what i can do.
sogrady says:
December 15, 2010 at 12:35 pm
@Kevin Schmidt: J2EE is ~387 mentions, or roughly comparable with Grails in other words.
it’s not practical to search for EE, meanwhile, b/c it’s a conventional term with other meanings (e.g. Enterprise Edition).
sogrady says:
December 15, 2010 at 12:36 pm
@Robin: the searches here were focused on technologies also usable in server contexts, which is the reason OS X wasn’t included.
whatsthebeef says:
December 15, 2010 at 3:31 pm
Great post, excellent angle on data analysis.
Surprised by the absence of Bigtable in NoSQL, although App Engine’s popularity is reflected in the cloud provider graph. Also no Perforce in VCSs.
Jeff Thompson says:
December 15, 2010 at 5:10 pm
Was Perforce really not discussed? It’s pretty dominant in the games industry, though I can’t speak to elsewhere.
Jeremy Zawodny says:
December 15, 2010 at 3:39 pm
Very interesting! Thanks for posting.
sogrady says:
December 15, 2010 at 5:24 pm
@whatsthebeef / @Jeff Thompson: Perforce is at 251 mentions, FYI, which leaves it in last place.
Donnie Berkholz says:
December 16, 2010 at 12:21 am
I see Gentoo showed up, thanks! Interesting that it’s equal to CentOS. Perhaps that means the hype’s gone away, and we’ve finally got real users instead of trend followers (currently with Ubuntu?).
Open Sources » Turning popularity into cash says:
December 16, 2010 at 9:08 pm
[…] analyst Stephen O’Grady offers some intriguing analysis of technology trends, based on mentions in Hacker News, but one in […]
2010, the year of Ubuntu « rand($thoughts); says:
December 17, 2010 at 5:36 am
[…] another data point, RedMonk analyst Stephen O’Grady analyzed data from Hacker News consisting of 1.7 million entries. O’Grady explains: This dataset is interesting not because […]
Hoe houdbaar is het succes van Ubuntu? | says:
December 27, 2010 at 11:49 am
[…] another study, RedMonk analyst Stephen O’Grady analyzed the data from Hacker News, consisting of 1.7 million […]
Hoe houdbaar is het succes van Ubuntu? | Talk About IT says:
December 29, 2010 at 1:09 am
[…] another study, RedMonk analyst Stephen O’Grady analyzed the data from Hacker News, consisting of 1.7 million […]
Rethinking Ruby’s role in the cloud « rand($thoughts); says:
January 28, 2011 at 5:24 am
[…] to analysis by RedMonk’s Stephen O’Grady, the alpha geeks on Hacker News are quite interested in Ruby on Rails. As RedMonk has previously stated, the alpha geeks are typically ahead of the IT adoption curve. If […]
ehcache.net says:
March 14, 2011 at 9:36 pm
What’s Popular on Hacker News: From the Cloud to NoSQL…
Ubuntu, le gagnant discret du Cloud Computing « Ippon Technologies says:
October 24, 2011 at 8:39 am
[…] here come from Amazon EC2, which is by far the leader in private cloud (see, for example, this analysis from RedMonk). In private cloud, typically with VMware, things are in fact very quiet. […]