Google is creating a global database of child abuse images that the company hopes, when shared with other search engines, will help eradicate child pornography from the Internet. While this is certainly a goal worth fighting for, sadly it is also a goal that is out of reach.
The fact that Google shared its new program with U.K. publication The Telegraph suggests the company was responding to increasing political pressure from the U.K., most notably Prime Minister David Cameron’s June 10 remarks calling out Google and other search engine companies for enabling the proliferation of such images on the Internet.
The new program certainly sounds promising: the company will be working to create a database of flagged images within a year’s time that will be shared with other search engines in the hopes that such content will “be wiped from the web in one fell swoop,” the Telegraph proclaimed.
Unfortunately, the end of all child pornography will not be the result of such a program. Images of raped and abused children will not be eliminated from the Internet; at best, they will become far less likely to surface in search results on Google, Bing, Yahoo and Ask.
Google Giving Director Jacqueline Fuller detailed the program with far less hyperbole on Google’s official blog Saturday:
Since 2008, we’ve used “hashing” technology to tag known child sexual abuse images, allowing us to identify duplicate images which may exist elsewhere. Each offending image in effect gets a unique ID that our computers can recognize without humans having to view them again. Recently, we’ve started working to incorporate encrypted “fingerprints” of child sexual abuse images into a cross-industry database. This will enable companies, law enforcement and charities to better collaborate on detecting and removing these images, and to take action against the criminals. Today we’ve also announced a $2 million Child Protection Technology Fund to encourage the development of ever more effective tools.
Even if these horrific images are identified, that doesn’t automatically remove them from the Internet. It takes law enforcement and Internet service provider intervention to do that, as Fuller stated.
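The duplicate-detection idea Fuller describes can be sketched in a few lines. Google’s actual system almost certainly uses robust perceptual fingerprints rather than a plain cryptographic digest, but the principle – reduce each flagged image to an ID that machines can compare without a human ever viewing the image again – looks roughly like this (all names here are illustrative, not Google’s):

```python
import hashlib

def fingerprint(image_bytes: bytes) -> str:
    """Return a stable ID derived from an image's exact byte content."""
    return hashlib.sha256(image_bytes).hexdigest()

# The shared cross-industry database is, conceptually, a set of known fingerprints.
known_hashes = {fingerprint(b"...flagged image bytes...")}

def is_known(image_bytes: bytes) -> bool:
    """Check a candidate image against the database; no human views it."""
    return fingerprint(image_bytes) in known_hashes
```

Note the limitation this sketch makes obvious: an exact hash only catches byte-identical copies, which is why real systems use fingerprints designed to survive resizing and re-encoding.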
And there’s the fact that, for all their power, Google and the other search engines do not have the entire Internet tracked. Estimates vary wildly, with some guessing that Google may have up to 12% of the Web indexed, and others pegging that percentage as low as 0.04% of total Web content.
Whatever the figure, none of the search engines comes close to indexing the entire Web. Nor will they ever get there, at least not the way they work now.
Search engines rely on following links to new content on the web. So, if a site containing illicit content is not linked to any other site, the search engines won’t even know it’s there.
And, even if they were able to find such a site, search engines still abide by its robots.txt file, something that all well-behaved web crawlers examine before stepping across a site’s threshold. If the robots.txt file says no search engines allowed (and there are various legitimate reasons why an administrator might want to keep crawlers out), then no indexing will happen.
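This honor system is simple enough to demonstrate with Python’s standard urllib.robotparser module, which implements the same check a compliant crawler performs. Here, a robots.txt that disallows everything keeps a hypothetical page off-limits to any crawler that plays by the rules:

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that bars every crawler from the entire site.
robots_txt = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A compliant crawler asks permission before fetching anything.
allowed = parser.can_fetch("Googlebot", "http://example.com/hidden/page.html")
print(allowed)  # False: the crawler walks away without indexing
```

Nothing technically stops a crawler from fetching the page anyway; the whole mechanism rests on voluntary compliance, which is exactly the honor system at issue.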
Google could, in the interests of hunting down illicit content, ignore the robots.txt restriction, but busting that honor system would negatively impact a lot of sites that have done nothing wrong.
Content can also be hidden on sites by putting it behind forms. Search engines generally don’t index pages that are generated only in response to a form submission. If they did, search results would be inundated with product catalog pages every time we looked for men’s shirts.
To be clear, Google’s program is a strong step toward making it harder to find child pornography on the Internet – and that’s a damn good thing. But sources already known to purveyors of this content will still be available to supply material that exploits children. All the search engines are doing is making it harder for newcomers to locate it.
In the long run, politicians and citizens should be happier: if this program is successful, child pornography will largely vanish from easy public view. But this will be just a Potemkin village – a clean-looking version of the Internet that does not reflect the fact that these terrible images are still out there, just better hidden.