The long-awaited catalog of public data from the US government launched this morning at Data.gov. Developers, watchdogs and data nerds around the world rejoiced – but the initial offering is a bit of a letdown.
New federal CIO Vivek Kundra is in charge of the site, which will act as a central repository for government data, including XML, CSV, KML files and more. At launch, a mere 47 data sets are included and they appear to lean towards the least controversial matters. Nonetheless, it’s exciting to see the effort happening. Hopefully some awesome mashups are on the way!
There are many, many sets of data available from the federal government, but the Data.gov site says it was selective about quality and standards when choosing what to include. It’s hard not to compare the site with other sources of government data and feel disappointed, though. The privately built USGovXML.com contains far more data and was built by one independent developer over four months. That site lists ten Department of the Interior XML feeds, for example, none of which appear on Data.gov. You can find a feed of food recalls there, but not on Data.gov.
Twenty-six government agencies are represented in the catalog, though not all are offering raw data. The FBI is listed as a source but only offers a widget that can be placed on websites, not access to raw data.
New York Times data wonk Derek Willis pointed out that the initial offerings are non-controversial. “Most are from USGS, EPA and National Weather Service,” Willis observed this morning. “No [data from] Department of Homeland Security, State or DOJ.”
Likewise, searches of the data sets for keywords like food, prisons and drug all bring up zero results. Those are particularly important topics because they are matters of justice and injustice – shedding light into dark corners where injustices are being perpetrated is one of the most important things that government data and the subsequent computer-assisted reporting can accomplish.
There are no RSS feeds available for the whole catalog or for search queries, something that would be very useful for tracking additions of new data. We expect that will change soon.
People will no doubt argue that some data is much better than no data, and that’s true. But for a new federal office to engage with such an important topic, with the weight of history and the whole administration behind it, and then come up with something this limited is disappointing.
API and mashup watcher John Musser of ProgrammableWeb was more generous than we are about the initial offerings:
“They’re off to an excellent start. It’s a big step in accessibility of government data. As we’ve been seeing with other v1 gov-data efforts, like the recently available data on senate votes: step one is give people structured data like xml, step two (or later) is to make it available via an API. They have a healthy amount of metadata. The number of data sets is not that large, but of course it’s just the beginning.”
It is just the beginning, and we applaud the launch of this effort. We hope that the initial launch will pale in comparison to the long-term value of this collection of data.
The folks at Sunlight Labs, Google, O’Reilly/TechWeb and Craig Newmark just launched a new part of their Apps for America contest to build the best mashups and data visualization tools for data on the new Data.gov site. Check it out!
See also the newly launched Whitehouse.gov/open – launches today just keep popping up.