Having data available electronically is not the same thing as the data being useful. Campaign finance disclosures provided electronically by the Federal Election Commission (FEC) are a good example. The New York Times’s Fech (not “fetch”) is a RubyGem – a packaged Ruby library – designed to help journalists and public interest organizations access and make sense of FEC filings.
Here’s the NY Times’ description of Fech from its first release last year:
Journalists who work with these filings need to extract their data from complex text files that can reach hundreds of megabytes. Turning a new set into usable data involves using the F.E.C.’s data dictionaries to match all the fields to their positions in the data. But the available fields have changed over time, and subsequent versions don’t always match up. For example, finding a committee’s total operating expenses in version 7 means knowing to look in column 52 of the “F3P” line. It used to be found at column 50 in version 6, and at column 44 in version 5. To make this process faster, my co-intern Evan Carmi and I created a library to do that matching automatically.
Fech (think “F.E.C.h,” say “fetch”), is a Ruby gem that abstracts away any need to map data points to their meanings by hand. When you give Fech a filing, it checks to see which version of the F.E.C.’s software generated it. Then, when you ask for a field like “total operating expenses,” Fech knows how to retrieve the proper value, no matter where in the filing that particular software version stores it.
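To make the problem concrete, here is a minimal sketch of version-aware field lookup using the column numbers from the quote above. It is not Fech's actual code or API. The constant, the method name and the sample rows are all hypothetical, and it assumes the quoted column numbers count from 1.

```ruby
# Illustrative only: this is not Fech's implementation or API.
# Column numbers are the ones quoted above for "total operating expenses"
# on the F3P line, assumed here to be counted from 1.
OPERATING_EXPENSES_COLUMN = {
  7 => 52,
  6 => 50,
  5 => 44
}.freeze

# Return the "total operating expenses" value from an F3P row, given the
# version of the filing format. `row` is an array of field values.
def total_operating_expenses(row, version)
  column = OPERATING_EXPENSES_COLUMN.fetch(version) do
    raise ArgumentError, "no column mapping for version #{version}"
  end
  row[column - 1] # Ruby arrays are 0-indexed, so shift by one
end

# Hypothetical rows: empty except for the one value of interest.
v7_row = Array.new(60, "")
v7_row[51] = "1250.00"
v5_row = Array.new(60, "")
v5_row[43] = "980.50"

puts total_operating_expenses(v7_row, 7)
puts total_operating_expenses(v5_row, 5)
```

The caller asks for the field by name and never touches a column number. That is the convenience the Times describes, here reduced to a single field and three versions.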
Derek Willis of the NY Times announced the 1.0 release of Fech last month. This release covers “all of the current form types that candidates and committees submit.” Perhaps most importantly, this release allows comparing two filings against one another.
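As a rough illustration of what comparing two filings can mean in practice, the sketch below diffs two sets of summary fields. It does not use Fech's own comparison feature. Plain hashes stand in for parsed filings, and the method name, field names and values are made up for the example.

```ruby
# Illustrative only: plain hashes stand in for parsed filing summaries.
# Returns { field => [first_value, second_value] } for every field whose
# value differs between the two filings, including fields present in only one.
def changed_fields(first, second)
  (first.keys | second.keys).each_with_object({}) do |field, diff|
    before = first[field]
    after = second[field]
    diff[field] = [before, after] unless before == after
  end
end

# Hypothetical summary data for two filings from the same committee.
first_filing  = { committee_name: "Example PAC", total_operating_expenses: "1250.00" }
second_filing = { committee_name: "Example PAC", total_operating_expenses: "1310.00" }

changed_fields(first_filing, second_filing).each do |field, (before, after)|
  puts "#{field}: #{before} -> #{after}"
end
```

Only the fields that changed come back, which is the part a reporter scanning a revised filing usually cares about.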
Why Fech Matters
Fech is already being used by the NYT for its reporting and interactive visualizations of campaign spending. But that’s just one editorial team. Putting this tool in the hands of any developer or reporter who wants to work with the data opens up a lot more possibilities.
For example, there’s ProPublica, which is using Fech and the NY Times’ APIs for its reporting and interactive graphics. ProPublica is able to show not just that campaigns are spending, but how much and with whom. (So far the biggest winner is Mentzer Media Services, an ad agency that specializes in GOP campaigns – including the Swift Boaters.) Fech doesn’t automatically point that out, of course, but it helps journalists uncover it.
Data without context is useless. By helping developers and journalists work with the filings in a more structured way, Fech helps newsrooms (or any other group) put the data in context to find the story behind the data. It’s a long way from being simple to use, but it represents a significant improvement over the raw data. It’s Apache-licensed, so it might find its way into all kinds of data analysis tools over time.
With Fech maturing well before the elections this fall, it could help all kinds of organizations follow the money trails much more efficiently. Here’s hoping that happens.