Today at the Strata conference, the Stanford Visualization Group debuted DataWrangler, a Web-based visual tool for cleaning up messy data. According to its website, “Wrangler allows interactive transformation of messy, real-world data into the data tables analysis tools expect.” Data can be exported as CSV, TSV, or JSON.
Data wranglers can use the tool with the group’s data visualization tool Protovis, or with tools such as Excel, R and Tableau.
Another thing I often hear is that a large fraction of the time spent by analysts — some say the majority of time — involves data preparation and cleaning: transforming formats, rearranging nesting structures, removing outliers, and so on. (If you think this is easy, you’ve never had a stack of ad hoc Excel spreadsheets to load into a stat package or database!)
Putting these together, something is very wrong: high-powered people are wasting most of their time doing low-function work. And the challenge of improving this state of affairs has fallen in the cracks between the analysts and computer scientists.