Google's breadth of services is truly awesome, and the amount of information the company touches concerning our lives and world can sometimes feel downright frightening. Almost no one takes the old phrase "Don't Be Evil" seriously anymore, now that there are billions of dollars on the table and Chinese autocrats to satisfy, but regular evaluations of Google's ethical positions still seem advisable.
One of the big questions being asked with increasing frequency is this: Is Google taking data it collects through particular services and using it for its own benefit in other services? We know the company scans our Gmail and uses the text there to sell ads, but is this a tactic being employed across services? Some people appear to believe it is.
When enterprise wiki service Socialtext announced this morning that it was folding SocialCalc, the spreadsheet from VisiCalc co-creator Dan Bricklin, into its offerings, the announcement included this interesting customer quote:
"The timing of SocialCalc is perfect - we were in need of a wikified spreadsheet that had all of the utility of Google Docs without the datamining," remarks Brandon Stafford, Principal Engineer at GreenMountain Engineering.
We found it very interesting that a new application would specifically target Google's data mining as a weakness. That kind of tactic is likely to become increasingly frequent.
You might remember that people asked a similar question about MyBlogLog when Yahoo! bought the widely embedded service. Was Yahoo! using MyBlogLog to spy on AdSense and other activity unrelated to its own technology?
In Google's Defense
The information available cross-application is probably too seductive for Google, or almost any company, to pass up. The search and ad giant's saving grace may be that it has so much information in each silo already that it's uniquely satisfied not cross-pollinating.
Behind every alleged conspiracy at a giant company, though, is just a bunch of people doing their jobs. Only occasionally, we presume, do some of them come up with what would be a great idea as long as they don't get caught.
Some cross-pollination of data from one service to another might in fact be great - if we users had control over it and could use the same tactic for our own direct benefit. Until that kind of data portability policy and technology are in place, though, many of us would prefer that data remains right where it is and keeps its hands in plain sight.
One of the first posts I wrote in my time at TechCrunch was about a Google experiment that would use your computer's microphone to track the ambient audio in a room, determine what TV shows you were watching and then serve up related ads in your browser. Presumably that program hasn't gone anywhere; snooping-obsessed researcher Shumeet Baluja has moved on to other research, like monitoring video game players' behavior and psychology for ad targeting and watching how much porn people look at on their mobile phones.
Outside of Google's actions, data integrity (privacy) in hosted services has long been a concern, and some enterprise sales teams are now responding with boxes that run applications locally behind customers' firewalls. As recently as the end of last year, Salesforce.com admitted that one of its employees fell for a phishing scam and handed over the key to that company's customer email accounts.
What if it wasn't an accident or an outside party, though? What if data that was collected in "anonymous aggregate" proved just too juicy for personalization-hungry ad sales teams or security-obsessed government agencies? Do you trust Google to resist mining your data across the various Google services you use? Is avoiding "Google data mining" an effective selling point that would increase your consideration of products from another vendor? We expect that the answers to these questions will change over time, and we think it would be wise to revisit them periodically.