We hear about the digital reams of data our modern civilization produces, usually measured in terms of Libraries of Congress. But the Library of Alexandria might be a better yardstick—because most of that data is tossed, destroyed, lost to history.
At New Relic’s FutureStack conference in San Francisco, Lew Cirne, the software company’s CEO and founder, took a small step against this digital biblioclasm by announcing that his company would now store eight days of event data from the applications it monitors, up from 24 hours.
If you’ve tracked the plummeting costs of cloud computing, you know what this means: Next year, it will be six months of data. Soon, companies like New Relic might be promising a decade of recall for a low monthly fee. Before long, forever might be the standard, achievable at a trivial cost.
The Library Of Everything
This is an advance made possible by technology—specifically, the ubiquity of the cloud, and advances in database technology.
Until recently, Cirne argued, “you couldn’t capture everything all the time. It was too much data to capture on premise”—to store locally, on a company’s own servers. Barbaric, like keeping just a single copy of a philosopher’s treatise in one grand building.
“We didn’t collect all the individual data points, because it was just too much,” Cirne said. “You need to collect it all. That is an enormous amount of data.”
In New Relic’s case, across all the customers who have written its software into their applications, that’s 18 billion events a day, he added.
So that’s a lot of data. Not as consequential, perhaps, as the lost works of Hipparchus, but data that might otherwise be discarded.
Our Data, Our Lives
What can we find in these otherwise ephemeral bit trails, once we start keeping them? Cirne suggests we might find an exact record of an e-commerce transaction, with telltale traces of why it took hundreds of extra milliseconds to serve a Web page.
That’s a solidly prosaic reason. No doubt the Library of Alexandria contained now-lost records of shipments of grain and ledgers for royal treasuries.
But I like to think that in Cirne’s vision of boundless data there’s a hint of something more important: the idea that we might one day be able to reconstruct the digital flotsam of our daily lives, the ephemera we now discard.
New Relic’s quest to store all of our application logs is of a piece with, say, Google Photos’ push to be an unfillable digital shoebox, or Amazon’s urge to list everything that might be sold.
And the Internet of Things promises to explode the amount of data we generate a thousandfold.
If we just discard that data—or index it, or aggregate it, or otherwise condense and simplify it, which are equal sins in Cirne’s eyes—we won’t be able to calculate the loss.
Keep all the data! Why not?
Photo by Owen Thomas for ReadWrite