In the opening scene of the 2008 thriller Eagle Eye, the heroes attempt to conceal their conversation from the microphones of an omnipresent computer. Unfortunately for them, the evil machine eavesdrops by zooming in on the vibrations in a nearby glass of water.
This scenario isn’t so far-fetched now that researchers at MIT, Microsoft and Adobe have developed a computer algorithm that can reconstruct sound from visual information. In one instance, they were able to recover intelligible sound from the vibrations of a potato chip bag filmed through soundproof glass 15 feet away.
“This is totally out of some Hollywood thriller,” Alexei Efros, an associate professor of electrical engineering and computer science at the University of California at Berkeley, told MIT News. “You know that the killer has admitted his guilt because there’s surveillance footage of his potato chip bag vibrating.”
In the researchers’ paper, “The Visual Microphone: Passive Recovery of Sound from Video,” the group outlines their experiments videotaping everyday objects, including a glass of water, a potted plant, a box of tissues, and a bag of chips, and their success at capturing sound from each of them. Not only could the researchers recover actual speech (in this case, the words “Mary had a little lamb”), but they could also identify the gender and number of speakers.
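The paper’s actual method analyzes subtle motion with a complex steerable pyramid, but the core idea can be illustrated with a much cruder toy: collapse each video frame to a single brightness sample, and the resulting one-sample-per-frame signal, sampled at the camera’s frame rate, already carries large global vibrations. The sketch below is a minimal illustration of that idea, not the researchers’ algorithm; the simulated 8×8-pixel “video” and the `frames_to_audio` helper are invented for this example.

```python
import numpy as np

def frames_to_audio(frames, fps):
    """Toy visual microphone: one brightness sample per frame.

    The real method (Davis et al.) uses phase-based motion analysis;
    this crude version only captures vibrations that shift the whole
    frame's average intensity.
    """
    signal = frames.reshape(len(frames), -1).mean(axis=1)  # one sample per frame
    signal = signal - signal.mean()                        # remove the DC offset
    peak = np.abs(signal).max()
    if peak > 0:
        signal = signal / peak                             # normalize to [-1, 1]
    return signal, fps  # the "audio" is sampled at the frame rate

# Simulate a 2000 fps high-speed video of a surface vibrating at 220 Hz.
fps, seconds, tone = 2000, 0.5, 220.0
t = np.arange(int(fps * seconds)) / fps
brightness = 128 + 5 * np.sin(2 * np.pi * tone * t)      # vibration modulates brightness
frames = np.tile(brightness[:, None, None], (1, 8, 8))   # tiny 8x8-pixel "video"

audio, rate = frames_to_audio(frames, fps)
freqs = np.fft.rfftfreq(len(audio), 1 / rate)
dominant_hz = freqs[np.argmax(np.abs(np.fft.rfft(audio)))]
print(round(dominant_hz))  # → 220, the vibration frequency recovered from "video"
```

Note that the recoverable bandwidth is capped at half the frame rate, which is why the paper’s most impressive results relied on high-speed cameras (and, separately, on a rolling-shutter trick for ordinary ones).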
The forensics and surveillance applications of the technology are obvious, but Efros said its most valuable use may yet be undiscovered.
“I’m sure there will be applications that nobody will expect,” he said. “I think the hallmark of good science is when you do something just because it’s cool and then somebody turns around and uses it for something you never imagined.”
Feature image by carterse.