In a recent post, I outlined a kind of layman's test for the Semantic Web. I wrote that the tipping point for the Semantic Web may be when anyone can query a set of data about a historical figure and get a long list of structured results in return. I called this 'The Modigliani Test,' after my favorite artist Amedeo Modigliani. To pass this test, you must deliver - using Linked Data - a comprehensive list of locations of original Modigliani art works around the world.

A developer named Atanas Kiryakov gave the test a good crack. In doing so, he illustrated the core issues facing the Semantic Web currently.

The challenge of this test is that there isn't currently enough Linked Data on the Web about Modigliani. Also, the key data in this test is the location of each art work, which probably isn't one of the main data fields when art data is uploaded to the Web (artist name and art work title would be the two key data fields).

Kiryakov wasn't the only person who attempted to pass the test, and in fact his results mirror what can already be found on the popular open database Freebase. However, Kiryakov, who is the Executive Director of Bulgarian Semantic Technology company Ontotext AD, did a great job of explaining his methodology and noting the issues he faced.

The Current State of Linked Data Queries

The result of Kiryakov's attempt is a relatively short list of locations of Modigliani paintings around the world. He admits that the list isn't long enough, but says that it's the closest he could get - not just because of the limited amount of data in the Linked Data Web, but because it's "hard to query and use today."

Essentially Kiryakov created code to query a few known Linked Data sets, with custom manipulations to output location data. This is what he came up with:

PREFIX fb: <http://rdf.freebase.com/ns/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbp-prop: <http://dbpedia.org/property/>
PREFIX dbp-ont: <http://dbpedia.org/ontology/>
PREFIX umbel-sc: <http://umbel.org/umbel/sc/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ot: <http://www.ontotext.com/>
SELECT DISTINCT ?painting_l ?owner_l ?city_fb_con ?city_db_loc ?city_db_cit
WHERE {
  ?p fb:visual_art.artwork.artist dbpedia:Amedeo_Modigliani ;
     fb:visual_art.artwork.owners [ fb:visual_art.artwork_owner_relationship.owner ?ow ] ;
     ot:preferredLabel ?painting_l .
  ?ow ot:preferredLabel ?owner_l .
  OPTIONAL { ?ow fb:location.location.containedby [ ot:preferredLabel ?city_fb_con ] } .
  OPTIONAL { ?ow dbp-prop:location ?loc. ?loc rdf:type umbel-sc:City ; ot:preferredLabel ?city_db_loc }
  OPTIONAL { ?ow dbp-ont:city [ ot:preferredLabel ?city_db_cit ] }
}

That query was executed in a tool called LDSR, a "Linked Data Semantic Repository" created by Kiryakov's company Ontotext. He calls LDSR a "search engine for part of the linked data web." Ontotext's LDSR includes data from existing Linked Data repositories such as DBPedia, Freebase, Geonames, UMBEL and Wordnet.
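For readers wondering how a query like this gets sent to a repository in the first place, here is a minimal sketch in Python. It assumes the repository exposes a standard SPARQL Protocol endpoint over HTTP, which may or may not be how LDSR works. The endpoint URL is a placeholder rather than LDSR's real address, `build_request` is a hypothetical helper name, and the query is a shortened version of Kiryakov's that keeps only the painting and owner labels. The request is built but never sent.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint, for illustration only; not LDSR's real address.
ENDPOINT = "http://example.org/sparql"

# Shortened version of Kiryakov's query: painting and owner labels only.
QUERY = """
PREFIX fb: <http://rdf.freebase.com/ns/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX ot: <http://www.ontotext.com/>
SELECT DISTINCT ?painting_l ?owner_l
WHERE {
  ?p fb:visual_art.artwork.artist dbpedia:Amedeo_Modigliani ;
     fb:visual_art.artwork.owners [ fb:visual_art.artwork_owner_relationship.owner ?ow ] ;
     ot:preferredLabel ?painting_l .
  ?ow ot:preferredLabel ?owner_l .
}
"""


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a SPARQL Protocol GET request."""
    # The protocol passes the query text in a URL-encoded "query" parameter.
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Ask for the standard JSON serialization of SPARQL results.
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


request = build_request(ENDPOINT, QUERY)
```

Passing `request` to `urllib.request.urlopen` would actually run the query, but only against a real endpoint that holds the relevant data sets.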

Here is a screenshot of Atanas Kiryakov's attempt to pass the Modigliani Test. He spent over an hour formulating the code used to generate this result.

As you can see, the resulting list was just 8 items long and most of the locations are in major U.S. cities. This falls well short of a comprehensive list of Modigliani art work locations. For example, there's no data about Modigliani paintings in Europe - where Modigliani lived all his life.
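Note that the query returns up to three different city columns per row (`?city_fb_con`, `?city_db_loc` and `?city_db_cit`), because the location can come from any of three properties. Someone still has to merge those into a single answer. Here is a rough sketch in Python of what that post-processing might look like, assuming each result row arrives as a dictionary keyed by the SELECT variable names. The preference order among the city columns is my assumption, the function names are hypothetical, and the sample rows are made up for illustration; they are not Kiryakov's results.

```python
from typing import Optional

# Column names follow the SELECT clause of the query.
# The order is an assumed preference, not something Kiryakov specified.
CITY_COLUMNS = ("city_fb_con", "city_db_loc", "city_db_cit")


def pick_city(row: dict) -> Optional[str]:
    """Return the first non-empty city label in a result row, or None."""
    for column in CITY_COLUMNS:
        value = row.get(column)
        if value:
            return value
    return None


def locations(rows: list) -> list:
    """Reduce rows to unique (painting, owner, city) tuples, keeping order."""
    seen = set()
    result = []
    for row in rows:
        entry = (row.get("painting_l"), row.get("owner_l"), pick_city(row))
        if entry not in seen:
            seen.add(entry)
            result.append(entry)
    return result


# Made-up rows for illustration only; not Kiryakov's actual results.
sample_rows = [
    {"painting_l": "Painting A", "owner_l": "Museum X", "city_db_loc": "City 1"},
    {"painting_l": "Painting A", "owner_l": "Museum X", "city_fb_con": "City 1"},
    {"painting_l": "Painting B", "owner_l": "Museum Y"},
]

merged = locations(sample_rows)
```

With these sample rows, the two "Painting A" rows collapse into one entry, and "Painting B" is kept with no city at all, which is exactly the kind of gap the test exposes.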

Other Sources of Modiglidata

Kiryakov wrote that most of the data returned in the Modigliani example came from Freebase. Indeed, as RWW commenter Brian Karlak pointed out in our original post, you can get much the same result within Freebase itself. Another commenter, Michael, pointed to a non-technical results page. Kiryakov's result has a little more data, but not much more.

However, the point of Kiryakov's attempt and blog post was to show how difficult it is to pass the Modigliani Test right now. He noted that "getting useful information from LOD [Linked Open Data] quite often requires a lot of efforts to analyze and post-process them in order to get reasonable answers to structured queries." In other words, it takes much more than just inputting a natural language query (note that the Freebase example was provided by a user there named masouras, so it's not something an average user could do).

I should also mention that in the comments to the previous post, Bruce Wayne pointed to his company Factoetum's effort to pass the test - which had 7 results, including some that differ from the Ontotext/Freebase ones. Like Kiryakov, Wayne noted that it's "nearly impossible" for non-technical people to use the current solutions.

Finally, to address an issue that some commenters raised in the previous post: yes, it would be possible to pass the Modigliani Test with some manual human effort to track down location data. But that's cheating - we want to see this done using Linked Data. And not just for Modigliani works, but for any other artist's too.

Much Work to Be Done

Atanas Kiryakov concluded that "there is still a lot of work to be done, because we cannot expect wide usage and interest in the Semantic Web if writing such a query takes more than an hour and a lot of technical knowledge."

While that's true, I thank Atanas for giving the Modigliani Test a crack. At least now I know to visit the Museum of Modern Art when I next go to New York!

Let us know your thoughts on the Modigliani Test in the comments. Or perhaps you're a developer willing to take on this challenge?