In response to a Wired article that ran yesterday, Google is fixing its archives of Usenet posts, one of the richest and oldest repositories of user-generated content ever to exist online.
For those of you under the age of 30, Usenet began in 1979 in Chapel Hill as a collection of newsgroups. In the years that followed, Internet history unfolded, jargon was coined, and lore was created in these discussions. In 2001, Google acquired two Usenet archives comprising 700 million posts but then failed to index them in any meaningful way. As of today, that wrong is being righted.
In the past, searching Usenet posts archived in Google Groups often yielded few or no results. For example, this recent discussion thread is all about the brokenness of Google’s Usenet archives and search capabilities.
“None of my posts are showing up (using advanced search, trying email and name in the author field, even limiting the date range to the right years),” wrote one user.
Noting that Google’s Usenet search “often… returns no results for queries which obviously shouldn’t,” another user said, “You just have to cross your fingers and hope that they [Google] notice the problem themselves and fix it.”
Fortunately, after media attention and user complaints, the search giant has responded and rectified the situation.
Today, Google rep Victoria Katsarou told Wired, “It turns out there was a bug, a specific bug, that affected search within a specific group. That bug is something we’re working on fixing, and I think that will be fixed by tomorrow. Thanks for writing this, because that’s how we discovered this specific bug.”
Just one bug wrecking search results for archives spanning 700 million posts and more than 20 years of data? That hardly seems likely.
Search results are particularly buggy when users filter them by date. As an example, searching alt.usenet.kooks for “godwin” produced 6,520 results, but when we tried to view those results sorted by date, we got only 93. And searching alt.comp.freeware for “MS-DOS” yielded no results after 2000, even though we eventually found posts dating back to 1995 when we browsed without narrowing the dates.
If complaints by Internet old-timers, Slashdot threads and detailed email exchanges aren’t enough to get Google to tend this garden of information and ensure it is searchable, and if media attention is really what it takes, then we must add our voices to those at Wired in asking Google to keep Usenet useful. And we ask that like-minded individuals do the same in the comments.
For a nice Usenet history lesson in timeline form, check out Google’s highlights of Usenet posts dating back as far as 1981. Of particular interest to us Web geeks at ReadWriteWeb is Tim Berners-Lee’s announcement of the World Wide Web.