RedMonk

They Say The Pacific Has No Memory: Well Neither Do Facebook, Twitter or Your iPhone

“You know what the Mexicans say about the Pacific? They say it has no memory.”

It’s easy to understand why Andy Dufresne, wrongfully imprisoned protagonist of The Shawshank Redemption, would value a place with no memory, no history. What’s less obvious, at least to me, is why we all feel that way.

Because clearly we must. Facebook, after all, has in recent days passed first the 400 million user mark and then Yahoo. And while the tool is exquisitely well designed to help us share the present, as far as the world’s most popular social network is concerned, the past may as well have never happened. What were you doing this time last year? Good luck figuring that one out; the UI certainly isn’t going to help you.

Which is not to single out Facebook: Twitter is no better. I can’t find an answer in Twitter’s API documentation to the question of how far back it lets you browse, but I’ve seen the number 3200 claimed a few times. If that’s true, about 66% of my Twitter history is invisible to me, the author. And while my iPhone dutifully backs up my SMS history, it does not – at least as far as I can tell – expose it to me. The closest thing is knowing where the messages are stored on the file system.

Yes, there are workarounds for all of the above: piping feeds into backup services like BackupMyTweets or YouArchiveIt, using one-off tools to extract and store your data, and so on. But how many users do you think will be able to find and successfully use those? More to the point: what are they going to do with raw backups? How will they search them, look things up by date, or drop them in a calendar?

They won’t, in all likelihood. It will be just like it is today: as if the past had never happened.
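The mechanics of a date index are not the obstacle, for what it’s worth. Here is a minimal sketch in Python, assuming a backup has already been exported as a list of records. The `created_at` and `text` field names, the sample data, and the helper names are all hypothetical; the real export formats of these services differ.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical backup format: one dict per message with an ISO 8601
# "created_at" timestamp and a "text" body. Real exports will differ.
backup = [
    {"created_at": "2009-02-14T09:30:00", "text": "Congratulations on the new job!"},
    {"created_at": "2009-02-14T18:05:00", "text": "Dinner was great."},
    {"created_at": "2010-02-14T08:00:00", "text": "Happy birthday!"},
]

def index_by_day(records):
    """Map each calendar date to the messages posted on it."""
    by_day = defaultdict(list)
    for record in records:
        day = datetime.fromisoformat(record["created_at"]).date()
        by_day[day].append(record["text"])
    return dict(by_day)

def on_this_day(index, month, day):
    """Return {year: [messages]} for one month and day across all years."""
    return {d.year: texts for d, texts in sorted(index.items())
            if d.month == month and d.day == day}

def search(records, term):
    """Case-insensitive substring search over message text."""
    term = term.lower()
    return [r for r in records if term in r["text"].lower()]
```

Calling `on_this_day(index_by_day(backup), 2, 14)` answers the “what were you doing this time last year?” question for February 14. The hard part is not this code; it is getting the data out, and getting a tool like this in front of ordinary users.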

For some, that might be a good thing. For others, it won’t matter, because the content is of marginal value or less. But for a subset of users and a subset of content, the unavailability is not a boon. Wouldn’t it be nice, for example, to look back on all the congratulations you received when you had a child? Started your new job? Got married? Or just had a birthday? More interestingly, might not that content have latent, unrealized value? Isn’t it possible, for example, that you could do sentiment analysis on your Twitter stream and have a more realistic and objective look at fluctuations in your mood and outlook, day to day, month to month, or year to year? Might you be able to mine it for undiscovered patterns of behavior? Imagine being able to browse your Facebook, Twitter and SMS history via just a simple calendar, the way you can with FourSquare. Are there privacy issues? You bet. But the alternative – privacy through simple loss of data – is no more attractive.
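As a rough illustration of the mood-tracking idea, here is a toy sketch in Python. It assumes posts are available as (date, text) pairs and scores them against two tiny hand-picked word lists. The lists, the sample posts, and the function names are invented for illustration; real sentiment analysis would need a proper lexicon or a trained model.

```python
from collections import defaultdict
from datetime import date

# Toy word lists, purely illustrative; not a real sentiment lexicon.
POSITIVE = {"great", "happy", "love", "congratulations"}
NEGATIVE = {"awful", "sad", "hate", "tired"}

def score(text):
    """Positive-word count minus negative-word count for one message."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def mood_by_month(posts):
    """Average score per (year, month) for an iterable of (date, text) pairs."""
    scores = defaultdict(list)
    for day, text in posts:
        scores[(day.year, day.month)].append(score(text))
    return {month: sum(s) / len(s) for month, s in sorted(scores.items())}

# Hypothetical sample posts.
posts = [
    (date(2010, 1, 4), "Tired and sad, awful commute."),
    (date(2010, 1, 20), "Great game last night!"),
    (date(2010, 2, 2), "Love the new job. Happy!"),
]

print(mood_by_month(posts))
```

Even something this crude shows the shape of the analysis: month-level averages only become possible once the full history is retrievable with its dates intact.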

There’s a reason that some very smart people are interested in “logging” more and more aspects of their lives: the more data you have, the more meaningful the conclusions you can extract. Unfortunately, Facebook, Twitter et al. are just trying to keep their heads above water at this point – four months ago, Facebook was adding 24-25 terabytes per day – so returning our data to us, period, let alone making it useful and meaningful, just isn’t a priority for them right now. In fact, we can probably recreate the Civil War more accurately from correspondence than we can the recent events of our lives.

But it should be more of a priority for us, and for those building applications for us. Because living strictly in the present at the expense of the past rarely does anyone any good. Just ask George Santayana.

Categories: Analytics, Data, Privacy.

  • A in NL

    It seems to me that this has always been the state of the web. A new version wipes out the old one, which is then gone forever to the public at large.

    Of course there have always been localized efforts to fight this. The wayback machine is probably the most centralized of these, and then there are individual backups and repositories (such as my collection of every one of my website incarnations), but these are just a drop in the bucket compared to all the information which has been created and which is now lost.

    Is this a bad thing? I’m a data packrat, but somehow the idea of missing my tweet history doesn’t disturb me when I think of the information which is already lost to us in the hundreds of years of recorded history and earlier. I’d probably rather have a fraction of a scroll from the Alexandria library than all of Twitter. We shouldn’t forget the past, but we also shouldn’t try to hang on to everything…and we should put what’s being lost into perspective.

  • http://ox.io juergen geck

    I like your observation about the necessity of the software as much as the data for either to make sense for the user. As much as I admire the efforts of Google’s data liberation front: I think it is useless. Because I don’t have a googleplex to digest that data.

    What we do with Social Open-Xchange is to treat Twitter as yet another messaging feed, just like Facebook status updates.

    In about a week from now you can try it out at http://ox.io … not to say you should wait to get an account :)

  • http://dkretzmann.blogspot.com Doug K

    “we can probably recreate the Civil War more accurately from correspondence than we can the recent events of our lives.”

    quite so. A decade ago my literary friends were talking about email’s effect on biography. Letters have a physical avatar which can be tracked. Emails are ethereal; how much more so are tweets and facebook snippets.

  • http://dkretzmann.blogspot.com Doug K

    to follow on from my last, a stolen letter from Descartes is changing perceptions of his philosophy.
    http://chronicle.com/article/Key-Letter-by-Descartes-Lost/64369/

    Who will find the stolen tweets of let us say @timoreilly, in three centuries’ time?

  • http://voodoowarez.com rektide

    “…piping feeds into backup services like BackupMyTweets or YouArchiveIt, using one off tools to extract and store your data, and so on. But how many users do you think will be able to find and successfully use those? More to the point: what are they going to do with raw backups? How will they search it, look it up by date, or drop it in a calendar?

    They won’t, in all likelihood. It will be just like it is today: as if the past had never happened.”

    Axiom 0A of the web: “Any resource of significance should be given a URI.” Without a URI, the resource does not exist, may as well not be there; anything not identifiable and named is not a thing, is not on the web. So yes, I agree, these hacked up backups are not really sufficient. If we’re adding terabytes of content, by george, it better be able to be a part of the web. Even if the data has gone 404, the axiomatics of the web tell us we should at least be able to consistently reference that data.

    This post seems to be focused on our personal memories, about lifestreams or activitystreams on facebook and twitter and the myopic weakness of these implementations. To that end, I’d cite Storytlr, a web-log aggregator. Storytlr helps you generate a cohesive or personalized feed out of the array of online activity streams out there (facebook, twitter, delicious, &c &c). Presumably any of the source feeds it pulls from will be referenced by your Storytlr feed; even if twitter forgets about your post, the Storytlr feed has captured it, in a web-standards compliant manner.