ThisWeKnow: New Semantic Web App Tames Massive Data Sets from Data.gov

Data.gov launched in May this year to make huge data sets from federal agencies available in machine-readable formats. While incredibly valuable, these data sets are not particularly useful in their current form to anyone but researchers, statisticians, sociologists, developers, and others accustomed to combing databases for trends.

At least for geographically relevant information, ThisWeKnow provides one use case for the data sets. Users can enter the name or ZIP code of any community and get details on all kinds of factors, from violent crime to companies releasing pollutants.

Each search for a location generates a page of “factoids,” as ThisWeKnow calls them: single sentences that express statistics about the community. Users can click a factoid to dig deeper or share it on Twitter via the “tweet this” links on the page.

Developed by a consortium of three organizations (web app shop and data analysis firm GreenRiver.org, web design studio Sway Design, and semantic web database company Intellidimension), ThisWeKnow is written in Ruby on Rails and queries an RDF database via SPARQL. The source code is available under an MIT license on GitHub. Users can also see the SPARQL query that generated the information on any particular page of the site.
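To give a sense of what a SPARQL query behind a factoid page might look like, here is a minimal sketch in Python (not the site's own Ruby code). The `ex:` vocabulary, the predicate names, and the `build_factoid_query` helper are all hypothetical and chosen only for illustration; the actual ThisWeKnow schema and queries may look quite different.

```python
from string import Template

# Hypothetical vocabulary: the prefix and predicates below are illustrative
# assumptions, not the actual ThisWeKnow or Data.gov schema.
QUERY_TEMPLATE = Template("""\
PREFIX ex: <http://example.org/vocab#>
SELECT ?label ?value
WHERE {
  ?place ex:zipCode "$zip_code" .
  ?fact ex:aboutPlace ?place ;
        ex:label ?label ;
        ex:value ?value .
}
LIMIT $limit
""")


def build_factoid_query(zip_code: str, limit: int = 10) -> str:
    """Return a SPARQL SELECT query for facts about one ZIP code."""
    # Accept only five ASCII digits so user input cannot alter the query.
    if not (len(zip_code) == 5 and zip_code.isascii() and zip_code.isdigit()):
        raise ValueError(f"not a five-digit ZIP code: {zip_code!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    return QUERY_TEMPLATE.substitute(zip_code=zip_code, limit=limit)


if __name__ == "__main__":
    # Print the query text; sending it to a SPARQL endpoint is out of scope.
    print(build_factoid_query("05401"))
```

The sketch only builds the query text. In a real application the string would be sent to the RDF store's SPARQL endpoint, and each returned label and value pair would be rendered as a one-sentence factoid.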

Out of the box, ThisWeKnow presents interesting information; however, we are curious to see how the developers go on to offer more options for sorting, comparing, and visualizing the available data.

GreenRiver.org managing director Michael Knapp addressed our desire for more granular data in an email, saying, “The presentation of these data at the town level was somewhat arbitrary – we figured it would be more recognizable to end users than block groups, etc. We needed to combine numerous data sets which present data at very different spatial aggregations, and of the 9 or 10 databases we’ve loaded, only one used coordinate data… Our vision is to have numerous facets into these data, including time (history), issues, etc., in addition to place-based ‘factoids’.”

In its first phase of development, the ThisWeKnow team has concentrated on a handful of spatially focused data sets from six different agencies in the Data.gov catalog. Ultimately, they hope to make the entire Data.gov catalog available to the public and to give developers an API for accessing the data.
