Each night this week while making dinner, I’ve listened to a different podcast interview with a semantic technology professional. It’s been a fascinating experience and the man I have to thank for the education is Dr. Paul Miller, technology evangelist for semantic web platform vendor Talis. Miller is producing an informative, enjoyable and prolific series of hour-long conversations with some people whose work is simply amazing.
Semantic web tools are no longer trapped in the lab; after years of research they are becoming products and entering the public market. There is semantic technology work underway at Skype, Joost and the BBC, just to name a few brand names. Last month we called Twine possibly the first mainstream semantic web app.
I think of it like this: once our software is capable of deriving semantic meaning from the web pages it looks at for us, a whole lot of work will already be done, allowing our human, creative minds to save time and reach new heights.
Semantic technology is a subject of great interest to many of our readers here and elsewhere; we’re proud that our post from yesterday highlighting 10 semantic companies to watch (including Miller’s Talis) hit the front page of Digg this morning, taking this look into the future to an even larger audience.
I didn’t know Richard MacManus was writing that post last night, but coincidentally I was already working on this interview with Dr. Miller. I’m very thankful that he took the time to help us dive deeper into the topic.
Marshall Kirkpatrick: What’s the elevator pitch on what the semantic web is and what promise it holds?
Dr. Paul Miller: I hope it’s quite a tall building we’re riding up in this elevator, as the Semantic Web offers a wide range of opportunities moving forward. Talis Platform Advisory Group member Mills Davis of Project 10X is about to release his ‘Semantic Wave 2008’ report, and that does a great job of illustrating just how broad the potential for semantic technologies could be.
Reaching right back to that famous Scientific American article in 2001, there’s been a tendency to paint a grand vision for The Semantic Web that encompasses a plethora of devices calling upon powerful reasoning and data mining capabilities to deliver a seamless, intelligent and unobtrusive end-to-end service to the end user. That vision is an interesting one, but still some way off.
Pared back down to its essentials, the Semantic Web is fundamentally about expressing connections between resources in ways that can be interpreted and acted upon by software. Tim Berners-Lee reaches back to some of those fundamentals in his blog post this month on the Giant Global Graph and as I remarked at the time …
“This is the long-held promise of the Semantic Web, but it is valuable to see that promise rearticulated in something akin to the language of the social network. Those involved in the Semantic Web probably ‘knew’ all of this at some level, but had perhaps become too caught up in the mechanics and the model, too distant from the point. This is why the Semantic Web matters; the graphing of relationships between resources on the open Web. Not ontology wars. Not RDF-is-better-than-microformats. Not demonstrations of concept in the laboratory and behind the firewall. Not the creation of a shadow web. This.”
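To make the idea of "graphing relationships between resources" a little more concrete, here is a minimal sketch of our own (not something Miller supplied). It assumes relationships are stored as simple (subject, predicate, object) triples; every URI, the predicate names `knows` and `authorOf`, and the helper functions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical triples: (subject, predicate, object). All URIs are made up.
TRIPLES = [
    ("http://example.org/people/alice", "knows", "http://example.org/people/bob"),
    ("http://example.org/people/bob", "knows", "http://example.org/people/carol"),
    ("http://example.org/people/alice", "authorOf", "http://example.org/posts/1"),
]


def build_index(triples):
    """Index objects by (subject, predicate) so links can be followed quickly."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].append(obj)
    return index


def reachable(index, start, predicate):
    """Follow `predicate` links outward from `start`, breadth first.

    Returns every resource reached, in the order it was discovered.
    """
    seen = []
    queue = [start]
    while queue:
        node = queue.pop(0)
        for target in index.get((node, predicate), []):
            if target != start and target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


if __name__ == "__main__":
    idx = build_index(TRIPLES)
    for resource in reachable(idx, "http://example.org/people/alice", "knows"):
        print(resource)
```

Because each link is explicit and typed, software can walk from one resource to the next without a person reading either page, which is the property Miller is pointing at.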
Marshall: What does Talis do?
Paul: At Talis, we’ve been dealing with rich structured data for almost forty years. More recently, we’ve been devoting our attention to the intriguing opportunities created by what my colleague Justin Leavesley recently called the “connectivity disruption”. We see broader trends around increased connectivity, falling network and storage costs, and the increasingly rich web of connectedness amongst resources on the open web. All of these come together to create an intriguing point-in-time opportunity, whereby there’s a real opportunity to switch the way that we think about building applications on the web and to move toward Tim O’Reilly’s notion of the ‘Internet Inside’. Instead of continuing to build applications that essentially consider the Web as an external source upon which to draw from time to time, we shift the model and construct a new generation of applications that truly, natively, consume content from across the Web… and it doesn’t matter whether that content is open and public, or the sorts of resources that an enterprise would traditionally keep locked up in some database well inside the firewall.
The Talis Platform (www.talis.com/platform/) lowers the barriers to creating these applications, by hiding some of the underlying Semantic Web complexities and simply offering capabilities to store, query and manage heterogeneous data via a set of consistent, simple, and very Web-like APIs. The Talis Platform itself is available for third party developers to work with, but we also build applications of our own on top of it, using exactly the same APIs as anyone else could. Talis Engage is the first of those new commercial applications to see the light of day, but there are plenty more to come.
As well as making extensive use of semantic technologies in our Platform and the applications we build upon it, we’re also invested in the wider success of the Semantic Web and those relying upon it. We’ve been members of W3C for a long time, and contribute to W3C activities such as GRDDL and SWEO. Through our Talis Platform Advisory Group we have assembled a great team of leading lights in the practical deployment of semantic solutions… and we work together to raise awareness and to build a market together, to the future benefit of ourselves and others. Our podcast series, too, is an attempt to lift the lid on some of the great work that is being done with the Semantic Web, and to pull in relevant views from those such as venture capitalists who might seek to fund future Semantic Web startups.
Marshall: Our own Alex Iskold is a big proponent of what he calls the “top down semantic web,” a strategy centered on outside eyes analyzing the contents of a web page to determine its meaning without depending on the page’s author to take steps (like deploying microformats) to communicate a page’s meaning themselves.
Your recent interview with Yihong Ding seemed to articulate one interesting “top down” approach. Iskold says that a top down approach is the only type that will prove feasible on any large scale. What do you think of this approach?
Paul: I commented on Alex’ post at the time, and certainly welcomed the profile-raising that his series of posts provided for these issues. I certainly agree that we’re not going to get very far on the open web if the onus is on the creators of content to do all the work to make it semantically rich or meaningful. However, as I suggested at the time, I’m not sure that I recognise the dichotomy that Alex painted, and think that the reality is an odd mix of top-down and bottom-up.
I certainly would agree that there’s an awful lot that can be done by your ‘outside eyes’ (and they’re probably the eyes of a machine, rather than a person) to infer pattern, meaning and structure from the ‘non-semantic’ resources with which today’s web is filled. Which takes us neatly back to Tim Berners-Lee and the GGG… 😉
Inside an enterprise, things may be different. There, a degree of rigour and structure certainly can be introduced, and some of the work on ontologies and taxonomies (see, for example, my podcast with Bill Hutchison this week) holds a huge amount of potential… if we can provide the right sort of tools to ensure that this addition of structure becomes a sensible and low-hassle aspect of the existing workflow.
Marshall: In your interview with European semantic web consultant Alberto Reggiori, Reggiori says that in seeking to get permission to build out semantic web technology in a company’s infrastructure, it is best that it not be a primary point of discussion, but that it be wrapped in a framework of more familiar technologies. I know that some people have said the same thing about other technologies, like RSS for example – that widespread adoption will require obfuscation of the technology itself (“just don’t tell them it’s RSS!”). What do you think of this approach with semantic technology? Reggiori talked about doing work for Skype, Joost and the BBC. Where else are semantics hiding that would be of interest to our readers and lend further credibility to the field?
Paul: I’d definitely agree with Alberto on this one. The Semantic Web shouldn’t be seen as an end in itself. Rather, it’s a set of attitudes, technologies and philosophies that can be usefully applied to the solving of real-world problems faced by real people. The trick, surely, is to offer solutions to those problems… in the language of their stakeholders… rather than marching into a company and saying “You need the Semantic Web”? The Semantic Web toolset is part of any solution… it’s not all of it.
Marshall: How would you describe the state of the semantic web?
Paul: About to get really interesting? There’s been Semantic Web research inside universities and the R&D arms of big companies for years. It’s only in the past year, really, that we’ve begun to see an emergence of end-user facing applications in which semantic technologies play a significant part. The analysts are sitting up and paying attention. The venture capitalists are starting to throw money around. Richard’s post on RR/W earlier today highlighted ten examples of companies embracing semantic technologies, and there are plenty more. I think we’ll see a lot of growth and a lot of interest into 2008, and it’s going to be fascinating to track, and to be part of that.
Marshall: If the semantic web is one of the bleeding edges of online technology (web 3.0, some people say) – what are some of the most experimental directions being explored within the semantic web community itself?
Paul: It’s probably best not to go down the whole “Is the Semantic Web the same as Web 3.0?” thing here; that’s a whole separate article in its own right! 😉 Some of Nova Spivack’s (CEO of Radar Networks, [makers of Twine] and another member of our Advisory Group) ideas around the ‘third generation of the web’ are interesting here, though.
Some of the areas that interest me most are those to do with understanding and following links across diverse sets of data out on the open web – hence our interest in licensing, which Richard mentioned in his piece. SWEO’s Linking Open Data community effort is one of the interesting exemplars here… they’ve been approaching a wide range of data-holding organisations (such as Wikipedia), and working with them to expose that data for use and reuse. Where it exists in some less than friendly format, they’re also helping to transform it into something like RDF… where it becomes far more useful.
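As a rough illustration of what "transforming it into something like RDF" can mean in practice, here is a small sketch of our own, not a description of how the Linking Open Data effort actually does it. It turns one flat record into N-Triples-style statements. The subject URI, the vocabulary base URL, the field names and the function names are all hypothetical, and for simplicity every value is written as a plain string literal with no datatype.

```python
def escape_literal(value):
    """Escape the characters that N-Triples requires escaping in string literals."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(subject_uri, record, vocab_base):
    """Emit one N-Triples-style line per field of a flat record.

    Assumes `subject_uri` and `vocab_base` are already valid URIs and that
    field names are safe to append to `vocab_base` (an assumption for this sketch).
    """
    lines = []
    for field, value in sorted(record.items()):
        predicate = vocab_base + field
        literal = escape_literal(str(value))
        lines.append('<%s> <%s> "%s" .' % (subject_uri, predicate, literal))
    return lines


if __name__ == "__main__":
    # Hypothetical record, as it might sit in a spreadsheet row.
    row = {"title": 'The "Giant" Graph', "year": 2007}
    for line in record_to_ntriples(
        "http://example.org/topic/1", row, "http://example.org/vocab/"
    ):
        print(line)
```

Once the data is in this shape, each field is a statement about a named resource, so it can be linked to statements published by other organisations about the same resource.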
Marshall: I thank Paul Miller for his time. I’m happy to make his acquaintance, and I hope you’ll check out some of the resources he points to above. The world ahead of us is an exciting one, and it’s approaching quickly.