Good overview article in Tech Review on
Tim Berners-Lee's quest to create the "Semantic Web": an interconnected maze of meaningful data that software applications can mine much more easily than the eyeball-focused WWW.
While it would be great if it worked, fundamental obstacles remain. When asked for an example of a working "phase 2" application of the Semantic Web, Berners-Lee points to the Friend of a Friend (FOAF)
RDF format. But follow the link to the
form-based FOAF creator and this quote stands out:
The 'discovery' aspect of FOAF (i.e. how FOAF compliant applications find your description) is still an area under discussion.
Isn't discovery
the problem the Semantic Web is designed to address? Common, machine-parsable data formats are a solved problem (see
XML, or
RDF for that matter). No one, however, has yet produced a real-world method for multiple client applications to discover and aggregate multiple data sources without the kind of human-arranged connections that don't scale with either the number of sources or the number of data formats.
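To make the contrast concrete, here is a minimal sketch of how little code the "solved" half takes. It reads a tiny, made-up FOAF document with Python's standard XML parser. The sample people and the helper name names_and_friends are invented for illustration, and the code treats RDF/XML as plain XML rather than doing real RDF processing. Note that the document is simply handed to the function as a string: how an application would find it in the first place is exactly the part nobody has solved.

```python
import xml.etree.ElementTree as ET

# The FOAF vocabulary namespace.
FOAF = "http://xmlns.com/foaf/0.1/"

# A made-up FOAF description, for illustration only.
SAMPLE_FOAF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Alice Example</foaf:name>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Bob Example</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>"""


def names_and_friends(foaf_xml):
    """Map each top-level person's name to the names of people they know.

    Hypothetical helper: handles only the simple nested layout used in
    SAMPLE_FOAF, not arbitrary RDF serializations.
    """
    root = ET.fromstring(foaf_xml)
    result = {}
    for person in root.findall(f"{{{FOAF}}}Person"):
        name = person.findtext(f"{{{FOAF}}}name")
        friends = [
            friend.findtext(f"{{{FOAF}}}name")
            for friend in person.findall(f"{{{FOAF}}}knows/{{{FOAF}}}Person")
        ]
        result[name] = friends
    return result


people = names_and_friends(SAMPLE_FOAF)
```

Parsing is the easy part. Nothing in this sketch says where SAMPLE_FOAF came from, and that is the hard part.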
In fairness, this isn't just a limitation of the Semantic Web. The much-ballyhooed "Web Services" will remain just the latest incarnation of
RPC (this time over HTTP) until the discovery problem is solved.
In lieu of a real discovery scheme, the FOAF page has several recommendations to ensure your RDF file is properly indexed by
Google. Google, arguably the most successful application on the web, doesn't understand a lick of semantics, but does a great job of information discovery purely through statistical analysis of link patterns.
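For a sense of what "statistical analysis of link patterns" can mean, here is a rough sketch of a simplified PageRank-style power iteration. It is not Google's actual implementation. The function name link_rank, the damping value, the iteration count, and the four-page toy web are all assumptions chosen for illustration. The point is that the scores come entirely from who links to whom, with no knowledge of what any page means.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages from link structure alone (simplified PageRank sketch).

    links maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with equal scores that sum to 1.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy web, invented for illustration: three pages link to "hub".
toy_web = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "a"],
    "hub": ["a"],
}
ranks = link_rank(toy_web)
best = max(ranks, key=ranks.get)
```

On this toy graph the most-linked page, "hub", ends up with the highest score, even though the algorithm never looks at page content.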
I think the applications that will provide us with the most benefits will be those that, like Google, operate by throwing lots of data at simple algorithms, rather than relying on the holy grail of a scalable, semantically-aware discovery protocol.
I have some thoughts on what one of those applications might look like... stay tuned.