Google just announced its first foray into making public data searchable and viewable in graph form. The company is starting with population and unemployment data from around the US but promises to make far more data sets searchable in the future. The potential significance of making aggregate data about our world easy to visualize, cross reference and compare can't be overstated.
Most of us understand the world based on stories we've put together from our own lived experience. Another way to understand things is by finding patterns drawn from everyone's experience in aggregate. Journalists often find big patterns and then zoom in to particular life stories that exemplify those general trends but make them easier for us to relate to as individuals. Those stories then help move public opinion in favor of policies that aim to change the general trends. That's just one way that easily searchable public data can be very, very important.
These first data sets come from the U.S. Bureau of Labor Statistics and the U.S. Census Bureau's Population Division, but as Google explains in its announcement there are far more sources of information that could be included. Those two government agencies alone have a lot more to offer as well.
We hope that Google will index as many public data sets as possible. We'd like to see demographic data like race and income made available for cross referencing; infant mortality, education levels, toxic waste reporting and crime statistics are other logical factors that would be great to see included.
It may not be a coincidence that the new Google Public Data search option was announced on the same day that the much-anticipated Wolfram|Alpha data-centric "expert knowledge" engine was first demonstrated to the public.
The coming era of the web is based on data, on drawing patterns and meaning out of a far larger body of data than the human mind alone could ever comprehend. The explosion of data (much of which is now created by the people formerly known as the audience), combined with commodity-level storage and processing power, makes technology like what Google began to unveil today both possible and important.
Google made its reputation by showing people the most important web pages on any topic. In the future, search engines will grow in importance as they become more capable of showing us what is most important across all web pages and all other available data, about any given topic. That's why we find the wide open conversations and social connections on Twitter so interesting, why we argue that the real motherlode of value in Facebook is not just individual streams of data but open access to all the data for analysis, and why we're so intrigued to see Google enter this space.
The availability of census and other public data has helped illuminate a wide variety of issues through "computer assisted reporting" - from the redlining of housing loans along racial lines to very current studies of ongoing urban segregation.
Just as blogging democratized publishing, we hope that Google and other services will make enough data sets available to cross-reference and visualize that analyzing public data also becomes something anyone can do. That means a whole lot more of it will be done.