In one of his periodic “mailbags”, in which he addresses readers’ email (we need to start doing that, I think) in a batch process, the Boston Globe’s Red Sox beat reporter Gordon Edes lamented the lack of interest in some quarters for “gamers” – written descriptions of game actions that run in the paper the following day – saying, “There is a school of thought out there that people don’t read gamers anymore.” The comment caught my eye not because I’m one of the folks he’s talking about – I haven’t read a game recap since October of 2004, I’d guess – but rather because I know precisely why I don’t read them: I have the boxscore.
For those of you unfamiliar with baseball generally, and the boxscore more specifically, the X-Files’ Mulder described it thusly:
I’m reading the box scores, Scully. You’d like it. It’s like the Pythagorean Theorem for jocks. It distills all the chaos and action of any game in the history of all baseball games into one tiny, perfect, rectangular sequence of numbers. I can look at this box and I can recreate exactly what happened on some sunny summer day back in 1947. It’s like the numbers talk to me, they comfort me. They tell me that even though lots of things can change some things do remain the same.
I’ve found boxscores fascinating for a long time, because they allow fans to follow games in fairly minute detail even if they’re not able to view them in person. You don’t get everything, to be sure, but they’re a relatively complete distillation of the action in a single baseball game. One of my old foremen when I worked construction in high school and college, a fine Irishman by the name of Anthony Crawford from Newry (right up the road from Rostrevor, where I stayed in an abbey for a few days a long time ago), said that boxscores were the one thing he preferred about baseball over his favorite sport, football (soccer, as we would term it). There’s just no equivalent in football, or most other sports for that matter, he used to tell me; the game lends itself to ready consumption and later dissection in a truly marvelous fashion.
Thus it is that every morning for better than six months of the year, I turn not to a reporter’s interpretation of the game’s events, but to the quantitative reflection of the same: the boxscore. I might supplement that, of course, by watching highlights on SportsCenter, listening to interviews of the players or managers, and so on – but the boxscore is the must-have. It’s almost as if the action of the game were distilled down into some perfect, tiny, programmatically manipulable format. Sort of a microformat, in other words.
What does the boxscore have to do with the future of journalism? More than you might think, I’d argue. Back at OSCON, as I wrote up here, The Washington Post’s Adrian Holovaty talked about a similar problem. And just a few days ago, he wrote it up in more detail here. The most important bit, in my view, is this:
One of those important shifts is: Newspapers need to stop the story-centric worldview.
Conditioned by decades of an established style of journalism, newspaper journalists tend to see their primary role thusly:
1. Collect information
2. Write a newspaper story
The problem here is that, for many types of news and information, newspaper stories don’t cut it anymore.
So much of what local journalists collect day-to-day is structured information: the type of information that can be sliced-and-diced, in an automated fashion, by computers. Yet the information gets distilled into a big blob of text — a newspaper story — that has no chance of being repurposed.
Let me clarify. I don’t mean “Display a newspaper story on a cell phone.” I don’t mean “Display a newspaper story in RSS.” I don’t mean “Display a newspaper story on my PDA.” Those are fine goals, but they’re examples of changing the format, not the information itself. Repurposing and aggregating information is a different story, and it requires the information to be stored atomically — and in machine-readable format.
His colleague at the Post, Derek Willis, documents some of the types of information that would be – in a perfect world – structured over on The Scoop.
The overriding theme here, I’d hope, is obvious. The best (only?) approach to extracting long-term value from captured data is storage in a fashion that makes it easily parsable over the longer term. Whether the data in question is the events of a baseball game or the exponentially more complex records that populate the Votes Database, it all comes down to structure and format. Prose, after all, is only slightly less opaque to programmatic approaches than audio or video.
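To make the point concrete, here is a minimal sketch of what "easily parsable" buys you. It assumes a made-up record type, `BattingLine`, holding one batter's line from one boxscore; the players and numbers are invented for illustration, not taken from any real game. Once the lines are stored as records rather than prose, a question that spans many games (a batting average, say) is a few lines of code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BattingLine:
    """One batter's line from one boxscore (hypothetical structure)."""
    game_date: str  # ISO date string, e.g. "2006-09-01"
    player: str
    at_bats: int
    hits: int


# Invented sample lines, not real game data.
LINES = [
    BattingLine("2006-09-01", "Player A", 4, 2),
    BattingLine("2006-09-01", "Player B", 3, 0),
    BattingLine("2006-09-02", "Player A", 5, 1),
    BattingLine("2006-09-02", "Player B", 4, 2),
]


def batting_averages(lines):
    """Sum at-bats and hits per player across games, then divide."""
    totals = {}
    for line in lines:
        at_bats, hits = totals.get(line.player, (0, 0))
        totals[line.player] = (at_bats + line.at_bats, hits + line.hits)
    # Skip players with no at-bats to avoid dividing by zero.
    return {
        player: round(hits / at_bats, 3)
        for player, (at_bats, hits) in totals.items()
        if at_bats > 0
    }


averages = batting_averages(LINES)
```

Getting the same answer out of two days of game recaps would mean reading them, or writing a parser for English.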
In the entry linked to above, Adrian notes some basic structural opportunities in typical newspaper features:
- An obituary is about a person, involves dates and funeral homes.
- A wedding announcement is about a couple, with a wedding date, engagement date, bride hometown, groom hometown and various other happy, flowery pieces of information.
- A birth has parents, a child (or children) and a date.
- A college graduate has a home state, a home town, a degree, a major and graduation year.
- An Onion-style “On the Street” feature has respondents, answers and a publication date.
- A drink special has a day of the week and is offered at a bar.
- The schedule of the U.S. Congress has a day and multiple agenda items.
- A political advertisement has a candidate, a state, a political party, multiple issues, characters, cues, music and more.
- Every Senate, House and Governor race in the U.S. has location, analysis, demographic information, previous election results, campaign-finance information and more.
- Every known detainee at Guantanamo Bay has an approximate age, birthplace, formal charges and more.
Put differently, what these features really need is their own version of a boxscore: a format that can distill the particulars from the body of the work. What technologies you might need to do that is a question best left for another time, but suffice it to say that I agree with ric when he injected microformats into the conversation, and I also think that we’ll begin to see some creativity on the storage engine side.
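As a rough illustration of what a "boxscore" for one of these features might look like, here is a sketch of an obituary as a structured record, using the pieces Adrian names (a person, dates, a funeral home). The class, field names, and helper functions are my own assumptions, not anything the Post or any microformat actually specifies.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class Obituary:
    """Hypothetical structured form of an obituary."""
    name: str
    died_on: date
    funeral_home: str
    service_on: date


def to_json(obit):
    """Serialize one record to JSON, writing dates as ISO strings."""
    record = asdict(obit)
    record["died_on"] = obit.died_on.isoformat()
    record["service_on"] = obit.service_on.isoformat()
    return json.dumps(record, sort_keys=True)


def services_at(obits, funeral_home):
    """Names of the people whose services are at a given funeral home."""
    return sorted(o.name for o in obits if o.funeral_home == funeral_home)


# Invented sample records for illustration only.
OBITS = [
    Obituary("Jane Doe", date(2006, 9, 1), "Example Funeral Home", date(2006, 9, 5)),
    Obituary("John Roe", date(2006, 9, 2), "Sample Memorial Chapel", date(2006, 9, 6)),
    Obituary("Ann Poe", date(2006, 9, 3), "Example Funeral Home", date(2006, 9, 7)),
]

example_services = services_at(OBITS, "Example Funeral Home")
first_as_json = to_json(OBITS[0])
```

The story can still be written from the record; the record can't easily be recovered from the story.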
In the meantime, just let me add that I think that Adrian, Derek and the gang are as important to the future of journalism as Craig Newmark.