A bipartisan bill limiting what companies can do with online user activity and profile data may be introduced tomorrow by Senators John McCain and John Kerry, according to reporting first in the Wall St. Journal and then today on marketing news site Clickz. The Journal's Julia Angwin, citing anonymous sources, reports that the bill will require that users opt in before their data is shared between companies and that users be able to see what data about them is being shared.

That might not sound so bad on the surface, but in a new world of fast-developing technology, it's good to think hard before making laws based on what might seem like common sense. The internet is a young thing and legislation like this could cut deep. Leadership on the issue from John McCain, who less than 3 years ago thought it appropriate to run for the Presidency without ever having used the internet before, seems particularly inappropriate. This is an issue that needs to be looked at from a pro-technology perspective, at least in part.

Data flying from point to point, out of your sight, only somewhat under your control, until magic happens - isn't that the nature of the Internet? And isn't using it at all opting-in to redistribution of data? That might be too philosophical. There's a practical story here of old-fashioned invention, too.

Websites "spying" on you? That's so 2001. The European Union this week signed a privacy and data protection plan for offline objects now being tracked with RFID chips. That's some good privacy talk!
Requiring opt-in, instead of opt-out, for data sharing would likely greatly reduce the amount of user data available for sharing, analysis and use in creating new software and services. While unpoliced data sharing clearly frightens many people (where are the victims of these crimes?), the consequences of stifling data sharing by industry are more tangible. [Note that data privacy violations can harm real people, but there's a difference between showing private info to other individual people and machines processing personal info in bulk. It's not the mysterious machines to be afraid of, it's the real-live creeps you actually know.]

"These regulations," says leading social network data hacker and ReadWriteWeb contributor Pete Warden, "will deter startups from building new tools like Mint.com or Rapportive, while the big corporations can devote whole departments to working around any new rules."

Handling a Tidal Wave

The consequences of such legal action are hard to foresee. "The tricky question is, what is Personally Identifiable Information?" says Warden. "Everyone wants to just regulate names, addresses, etc. but since you can deanonymize almost any user-generated data set, and derive that information...any regulations will end up affecting far more applications than you might expect."
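Warden's point about deanonymization can be made concrete with a toy example. The sketch below is only an illustration, not anything Warden or the bill describes: the records, field names and the `unique_fraction` helper are all hypothetical. It counts how many rows in a nameless data set become one-of-a-kind once a few ordinary fields are combined.

```python
from collections import Counter

# Hypothetical "anonymized" records: names removed, everyday fields kept.
records = [
    {"zip": "97201", "birth_year": 1975, "gender": "F"},
    {"zip": "97201", "birth_year": 1975, "gender": "M"},
    {"zip": "97201", "birth_year": 1982, "gender": "F"},
    {"zip": "97202", "birth_year": 1982, "gender": "F"},
    {"zip": "97202", "birth_year": 1982, "gender": "F"},
]


def unique_fraction(rows, fields):
    """Share of rows whose combination of `fields` appears exactly once."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


# One field alone singles nobody out in this toy data...
zip_only = unique_fraction(records, ["zip"])
# ...but three fields together make most rows unique.
all_three = unique_fraction(records, ["zip", "birth_year", "gender"])

print(f"unique by zip alone: {zip_only:.0%}")
print(f"unique by zip + birth year + gender: {all_three:.0%}")
```

None of these fields is a name or an address, yet anyone holding a second data set with the same fields could match the unique rows back to a person. That is why a rule aimed only at obvious identifiers may end up covering far more data than expected.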

Indeed, data is widely expected to become one of the key factors in the future of economic and social development. And so much data is personal. (There's a whole lot of personal data about all of us collected and shared off-line too, at the grocery store for example, or in direct marketing databases - but that's not the subject of so much ire.)

This is a common theme here on this blog. The example I've offered most commonly in calling for data to flow as freely as possible is the history of what's called real estate redlining. In the 1960s, when both U.S. Census information and real estate mortgage loan information were made available for bulk analysis, it was proven that banks around the U.S. were discriminating against home loan applicants in traditionally African American neighborhoods.

That was a big deal and I suspect that there are patterns of comparable importance, both positive and negative, hiding in the huge flowing river of online user data.

Dr. Dirk Helbing, of the Swiss Federal Institute of Technology, chairs the team building a project called the Living Earth Simulator (LES), a massive data project aiming to simulate as much natural and social activity on earth as possible. Those simulations, to be carried out on a scale inspired by the Large Hadron Collider, would aim to discover all kinds of patterns hidden in the mass of human and ecological data, including social network data.

Here's how he explains the importance of data analysis for pattern recognition. "Many problems we have today - including social and economic instabilities, wars, disease spreading - are related to human behavior, but there is apparently a serious lack of understanding regarding how society and the economy work," he says. "Revealing the hidden laws and processes underlying societies constitutes the most pressing scientific grand challenge of our century."

Data analysis uncovered systematic racial discrimination in housing loans in the 1960s. In the future, analysis of the incredible living census that is our internet data could be used to discover patterns and opportunities relevant to global warming, overpopulation, the spread of disease and the fact that the world today is an awful, unfair mess.

Setting the Tone

The US Federal Government, in discussing the issues around online data and privacy, always mentions the need, when taking government action regarding privacy, to safeguard the incredible potential for innovation in all this data.

The Wall St. Journal's extensive reports on these matters make no such effort. It's remarkable that the paper of record for capitalism makes no serious gesture in its reporting on data privacy to recognize the incredible economic engine that is online data. Instead, the publication's tone is fear-mongering and self-congratulatory. (A code on Ashley's computer knows that she likes the movie The Princess Bride and that information is sold to other companies for 1/10 of 1 cent... "Well, I like to think I have some mystery left to me," Ashley says, "but apparently not!" Poor woman! No mystery left!)

I hope that McCain and Kerry don't introduce a bill requiring online user data sharing to be opt-in only. If they do, I hope there's a lot more conversation and learning than there is legislating.

No one puts it better than the US Department of Commerce. That body said the following in its announcement of the new Federal Privacy Policy Office in December:

Strong commercial data privacy protections are critical to ensuring that the Internet fulfills its social and economic potential. Our increasing use of the Internet generates voluminous and detailed flows of personal information from an expanding array of devices.

Some uses of personal information are essential to delivering services and applications over the Internet. Others support the digital economy, as is the case with personalized advertising.

Some commercial data practices, however, may fail to meet consumers' expectations of privacy; and there is evidence that consumers may lack adequate information about these practices to make informed choices. This misalignment can undermine consumer trust and inhibit the adoption of new services. It can also create legal and practical uncertainty for companies. Strengthening the commercial data privacy framework is thus a widely shared interest.

However, it is important that we examine whether the existing policy framework has resulted in rules that are clear and sufficient to protect personal data in the commercial context.

The government can coordinate this process, not necessarily by acting as a regulator, but rather as a convener of the many stakeholders--industry, civil society, academia--that share our interest in strengthening commercial data privacy protections. The Department of Commerce has successfully convened multi-stakeholder groups to develop and implement other aspects of Internet policy. Domain Name System (DNS) governance provides a prominent example of the Department's ability to implement policy using this model.

Convening multi-stakeholder conversations between diverse industry and other experts. That sounds like a much better idea than passing laws that cut so deep, here so close to the dawn of the internet.