It was worth a shot. At the recent Strata Conference in Barcelona, Hadoop founder Doug Cutting took to the stage to argue for a new era of Big Data ethics.
“It’s time for us to reflect as we enter this new data age on how we want it to work,” Cutting declared. “This is the time when the practices and policies we want will be set for the coming decades.”
Cutting is right, of course. But he’s also too late. By open sourcing Hadoop under a liberal license, Cutting gave the world the rope to save or hang itself.
On the data privacy front, we seem hell-bent on the latter.
Spying On The Elephant
While Big Data bad behavior isn’t remotely exclusive to government, it is the U.S. government that has turned data into a cause for concern. Against this backdrop of widespread data (mis)use, Cutting told Strata attendees that the time is now to establish principles of transparency and ethics for the coming decades of Big Data adoption and use.
“In science fiction, the people who collect the data are the bad guys,” he laughingly noted. “I don’t want to be one of those bad guys.”
Few, perhaps, aspire to misuse data. But one person’s misuse is another’s fair use. And given that all the best Big Data technology is open source, there’s really nothing to prevent governments or private corporations from collecting and using data however they see fit.
As one Quora commentator puts it, “Open source is open source and people will use it for whatever and however they want to use it. It’s hard to make a morals call.”
And why shouldn’t they? After all, not only are organizations like the NSA and CIA feverishly using Hadoop, they’re also actively helping to develop Hadoop and other Big Data technology. In fact, while the NSA used to try to build its own data tools, it has now turned to Hadoop for much of the heavy lifting in analyzing the data sets it holds on citizens.
Some of the NSA’s modifications to Hadoop are being contributed back. Some almost certainly are not. Regardless, both jeopardize trust in government to the point that, as Google executive chairman Eric Schmidt posits, “We’re going to wind up breaking the Internet.”
The People Fight Back
Concern over government and corporate spying has given rise to new open-source projects like Detekt to help consumers fight back. Detekt, launched by Amnesty International, the Electronic Frontier Foundation and other non-profits, aims to uncover “commercial surveillance spyware that has been identified to be also used to target and monitor human rights defenders and journalists around the world.”
Worried your PC or smartphone is riddled with govt spyware? There's an app for that! #Detekt launches today http://t.co/nEwo7PZn23 #CAUSE
— amnestypress (@amnestypress) November 20, 2014
It’s a nice step in the right direction, though it’s hobbled by being a Windows-only executable. Running on Windows is irony at its finest, given that Windows has long offered U.S. spy agencies a back door.
What, Me Worry?
But it’s probably not fair to single out the U.S. government—or any other—for Big Data malfeasance. After all, private corporations are only too happy to use data to fight competitors and rope in consumers.
As I’ve written before, I’ve watched my own son get hammered by data-hungry gaming companies, and I have friends whose lives have been decimated by data-mad porn companies.
Cutting wants a new era of responsibility, but the temptation to use data will almost certainly prove irresistible for companies and governments. The only solution seems to be an uprising, not from the tech industry but rather from ordinary folks whose data is misused.
But for that to happen, we need to lose our addiction to free services like Gmail or Facebook (powered by Hadoop), which encourage us to contribute data so that we can have free storage, free socializing, free everything. Evgeny Morozov calls out this “disturbing trend whereby our personal information—rather than money—becomes the chief way in which we pay for services—and soon, perhaps, everyday objects—that we use.”
In sum, it’s nice to wish for a new era of Big Data ethics, whereby corporations and governments respect our privacy, but it’s hard to square that vision with the consumer’s willingness to sell her data for a mess of free services.
Lead image by takomabibelot