When Twitter announced last week that every public tweet since its inception in 2006 would be archived in the Library of Congress, many people were excited.
“The Twitter digital archive has extraordinary potential for research into our contemporary way of life,” says James Billington, Librarian of Congress. “Anyone who wants to understand how an ever-broadening public is using social media to engage in an ongoing debate regarding social and cultural issues will have need of this material.”
Developing the Methods to Curate Twitter
There is little doubt that the opportunity for scholarship is immense – for cultural anthropologists, for historians of technology, and for academics in any number of fields. But some scholars are uncertain whether the resource will live up to its potential. With estimates of over 50 million tweets per day, the Library of Congress archives will contain a massive amount of data.
“A MySQL dump from the Twitter database doesn’t make an archive,” says digital historian Tom Scheinfeldt. Although Scheinfeldt and other scholars agree that the move could be “tremendously useful,” it will only be so if the proper tools and methodology are developed.
Scholars face the challenge of designing and building the curatorial tools for evaluating the data in the Twitter archives. How will a researcher be able to isolate a single conversation? How can the social graph of those involved be isolated? What sorts of APIs will be developed, both for internal and for external research? And while the addition of annotations to Twitter will likely help with tracking future tweets, similar tools still need to be devised for archived data.
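To make the first two questions concrete, here is a minimal sketch of how a single conversation and its participants might be pulled out of a pile of archived tweets. It assumes each tweet is a record with an `id`, a `user`, and an `in_reply_to_status_id` field pointing at the tweet it answers. That record layout, the sample data, and the function names `isolate_conversation` and `participants` are illustrative assumptions, not a description of how the Library of Congress archive is or will be structured.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are assumptions for illustration.
SAMPLE = [
    {"id": 1, "user": "alice", "in_reply_to_status_id": None},
    {"id": 2, "user": "bob", "in_reply_to_status_id": 1},
    {"id": 3, "user": "carol", "in_reply_to_status_id": 2},
    {"id": 4, "user": "dave", "in_reply_to_status_id": None},
    {"id": 5, "user": "alice", "in_reply_to_status_id": 1},
    {"id": 6, "user": "erin", "in_reply_to_status_id": 4},
]


def isolate_conversation(tweets, tweet_id):
    """Return every tweet in the reply thread that contains tweet_id."""
    by_id = {t["id"]: t for t in tweets}

    # Map each tweet to the replies it received.
    children = defaultdict(list)
    for t in tweets:
        parent = t.get("in_reply_to_status_id")
        if parent is not None:
            children[parent].append(t["id"])

    # Walk up the reply chain to the earliest tweet present in the data.
    root = tweet_id
    seen = {root}
    while True:
        parent = by_id[root].get("in_reply_to_status_id")
        if parent is None or parent not in by_id or parent in seen:
            break
        root = parent
        seen.add(root)

    # Collect the root and everything that replies to it, directly or not.
    thread, stack, visited = [], [root], set()
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        thread.append(by_id[current])
        stack.extend(children[current])

    return sorted(thread, key=lambda t: t["id"])


def participants(thread):
    """Return the set of users who posted in the thread."""
    return {t["user"] for t in thread}


if __name__ == "__main__":
    thread = isolate_conversation(SAMPLE, 3)
    print([t["id"] for t in thread])
    print(sorted(participants(thread)))
```

Even this toy version shows why a raw dump is not enough: it only finds conversations that are linked by explicit replies, and it says nothing about retweets, mentions, or exchanges carried on without the reply field, which is exactly the kind of gap curatorial tools would have to address.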
Is There Commitment to Digital Scholarship?
The donation of the Twitter archive seems like a great gesture. However, it remains to be seen whether the preservation of social media information, including Twitter, will be a priority for both the government and the technology industry.
Although the Library of Congress and the National Archives have been committed to digital archiving for a number of years, programs like the Digital Preservation Program have been historically underfunded.
As historian Scheinfeldt notes, the announcement of the Library of Congress’s acquisition of the Twitter archives is really just “the beginning of the story.” Scholars like Scheinfeldt hope to be an active voice in shaping how the rest of the story plays out.