Let’s say you wanted to try and correlate taxi data, as one hedge fund reportedly did recently, with market performance. Where, precisely, would you go for that data?
Exactly.
That’s why we’re going to see, and soon, the rise of data marketplaces. It’s going to be driven by two simple realizations. First, that every well-run organization needs to become more creative in its usage of data analytics, because the margins are slimmer and the competition wider. Second, that just about every organization on the planet is carrying a moderately to heavily underutilized asset on its balance sheet in its data. Yes, even your organization.
The solution to both of those problems, quite obviously, lies in facilitated connections between buyers and sellers. Which means marketplaces. Marketplaces like Infochimps.
The site run by Flip Kromer and Joe Kelly promises to “Find Any Dataset in the World,” and is – to my way of thinking – the shape of things to come. Currently hosting 5648 datasets, ranging from “Price Indexes for Personal Consumption Expenditures” to “100,000+ official crossword words” to “Resident Population by Race, Hispanic-Origin Status, and Age” to “Retrosheet: Major League Baseball Awards and Honors,” Infochimps is positioning itself as an independent marketplace for data owners to market and sell their datasets. Thanks to Josh of Jones-Dilworth, I spoke to the founders last week and found their ideas of the opportunity ahead similar to my own. And they’re just great guys, to boot.
There are already information marketplaces, to be sure. SAP, as James passed along, sells data via its Information OnDemand store. And Acxiom and the credit bureaus have been trafficking in your details since the ’60s, according to Wikipedia.
But as the tools of data analytics are being democratized, as I mentioned yesterday, so too will the means of data acquisition be. Infochimps illustrates this perfectly; if anything, it’s too democratic. The licensing, because it’s set by the user, is all over the place, and there’s not much in the way of client integration.
Consider, however, what would become possible if they were able to build a bridge between Infochimps and something like IBM’s ManyEyes. What would be the difference between this visualization (Java required for both, sorry) of Brunswick temperatures I did a while back with data that I ferreted out from NOAA’s FTP servers:
and this one, which I did a few minutes ago using Austin weather data downloaded from Infochimps?
Nothing except for the time it took to find the data. Well, that and the fact that one’s Brunswick, ME and one’s Austin, TX.
Data marketplaces matter less, though, for data like weather that can be retrieved, albeit with some difficulty, than for data that’s currently just not made available, period. Marketplaces will depend initially on staples like weather and financial data but, like mobile app stores, will broaden significantly once critical mass is reached. Eventually we may even see co-op type sales; while the data from one small business, for example, isn’t likely to be that compelling, the information from 500 of them would be. Marketplaces should eventually be able to facilitate such sales.
In the meantime, I’d settle for one place to get my baseball data, because I’m tired of pulling it down manually from sites like this.
Postscript: Many of you probably read the above and are screaming “you’re going to sell my data??? what about my privacy!?!” Which is a fair question, and as a result one that I’ll tackle shortly.