As our network blog AltSearchEngines reported this morning, the long-awaited and much-hyped natural language processing search engine Powerset launched today. Kind of. For now, the search service only uses Wikipedia and Freebase as source material for answers to your query. So it's not really fair to compare it to Google yet, but this is a search engine, and that means it will always be held to the gold standard set by the market leader.
Comparing the two is tricky, since Google searches the entire web and Powerset only processes two sites. Our admittedly unscientific method was to compare a handful of searches on Powerset with the results for the same queries on Google restricted to "site:wikipedia.org."
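For anyone who wants to repeat the Google half of this comparison, the restricted searches can be sketched roughly as below. This is a minimal illustration rather than a record of how we ran the test: the helper name `restricted_google_url` is hypothetical, and it assumes Google's standard web search URL with the query passed in the `q` parameter.

```python
from urllib.parse import urlencode

# The four questions used in this informal test.
QUERIES = [
    "Who invented dental floss?",
    "What is the capital of France?",
    "Where is Paris?",
    "Who is Joey Tribbiani?",
]


def restricted_google_url(question, site="wikipedia.org"):
    """Build a Google search URL limited to one site (hypothetical helper).

    Appends the "site:" operator to the question so that only pages from
    the given domain are returned, mirroring the restriction described above.
    """
    params = {"q": f"{question} site:{site}"}
    return "https://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    for query in QUERIES:
        print(restricted_google_url(query))
```

Dropping the `site:` suffix gives the "Google at large" variant of the same query.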
Powerset does some interesting things with general queries, such as displaying "Factz," which is an ontology showing various concepts related to your query and how they relate to one another, or "Dossiers," which summarize key information about your query. Sometimes it yields odd results, such as this query for "ants," for which the key finding is that ants are "a fictional race from the video game Crash Twinsanity." However, the real promise of NLP search engines, in our opinion, is that users will be able to make search queries using natural language -- or in other words, by asking a question. So we chose a few questions at random -- things we knew Wikipedia would have answers for -- and threw them at both Powerset and Google.
Query: Who invented dental floss?
Powerset's answer for this query was curious. The number one result comes from the Wikipedia entry for dental floss and highlights this line: "It was around this time, however, that Dr. Charles C. Bass developed nylon floss." Charles Bass, however, is not the correct answer. Earlier in the same article is this line: "Levi Spear Parmly, a dentist from New Orleans, is credited with inventing the first form of dental floss." Why didn't Powerset find it? Its second result, which comes from a Wikipedia entry on scientific achievements from the year 1815, correctly highlights Parmly as the inventor.
Google performed poorly for this query. The same 1815 article is identified in the sixth spot on the results, with the sentence mentioning Levi Spear Parmly highlighted, but the first few results aren't even close. Even though that's not as impressive as Powerset's results, both would require a user to click through to the article to verify the answer (because Powerset returned two different answers), and is scrolling to the 6th spot really that taxing? Taxing enough to make you switch to a new search engine? Interestingly, this query set loose on all of Google does quite well, returning the correct answer in a link to a trivia site in the first result.
Query: What is the capital of France?
Not surprisingly, both Google and Powerset nail this one. Both point to the Wikipedia entry on Paris, France in the number one spot with the sentence, "Paris is the capital of France" highlighted.
Query: Where is Paris?
This is a fundamentally more challenging query, because there are a large number of cities and towns called "Paris" in the world. And not surprisingly, neither search engine gives what we would call a "perfect" result.
Both return the article on Paris, France first. On Google, that's followed by a handful of other articles about the city and one about Paris, Tennessee. On Powerset, the second article is about Paris Hilton -- um? -- followed by one about Paris, Texas, and in fourth place the most helpful article it could have returned, the disambiguation page on Wikipedia for Paris. (Oddly, with the question mark, the query returned "Paris, Missouri" from Freebase, and without the question mark it returned "Paris, Texas.")
On Google at large, the results focus almost exclusively on Paris, France.
It would seem that both search engines generally understand that "where is Paris" means that Paris is a place (though upon reflection, perhaps we could have been searching for the location of Paris Hilton...), but neither recognizes very well that it could mean any number of different places.
Query: Who is Joey Tribbiani?
Both Powerset and Google correctly call up the article about this fictional character in their first spot, but Google actually does a better job of highlighting who he is. Compare:
- Google: After the 2003/2004 final season of Friends, Joey Tribbiani became the main character of Joey, a spin-off TV series, where he moved to L.A. to polish his ...
- Powerset: In the end of the series, Joey was the only Friend that ended up without a lover or a spouse, even though he is the one that dated the most women. ... Joey becomes good friends with an attractive female attorney named Alex, who, along with her husband, a travelling [sic] musician named Eric, is Joey's landlord.
Google has the names of both shows in which the character appears in its excerpt, while Powerset's excerpt is made up of information about the series that only someone who already knew the character would understand (without clicking through to read the full article). It also doesn't differentiate between the two shows: before the ellipsis the excerpt is talking about "Friends," and after it, about "Joey."
Google at large also finds the Wikipedia article first with the same excerpt -- it also finds clips of the show on YouTube, the actor's (Matt LeBlanc) IMDb entry, and the official site for the spin-off "Joey."
This was really just a very quick and informal test, and we barely put Powerset through its paces. But our first snap impressions are that Powerset doesn't do a markedly better job of finding answers than Google for most queries. Some might argue that we didn't play to Powerset's strengths and frame our queries properly, or search for things obscure enough to notice any differentiation. But the promise of natural language search is that people don't have to learn how to search -- they can just ask questions as they normally would. We also can't expect that everything they're going to look for will be obscure and hard to find via traditional search engines -- more often than not, they probably won't be.
Powerset faces an immense uphill battle to make any sort of dent in the search market. Google controls 67% of searches in the US, and the top 4 search engines make up about 98% of searches. If Google remains "good enough," Powerset will have a hard time convincing people to switch. It will be easier to make a judgment about the company's future as a real Google competitor once it is crawling more than two sites, however.
What do you think about Powerset? Impressed? Not impressed? Let us know in the comments below.