<?xml version="1.0" encoding="UTF-8" ?>
<rss xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
        <channel>
        <title>Alexander Korth - ReadWrite</title>
        <link>http://readwrite.com</link>
        <description />
        <language>en</language>
        <copyright>Copyright 2012 SAY Media, Inc.</copyright>
        <managingEditor>readwriteweb@gmail.com</managingEditor>
        <docs>http://blogs.law.harvard.edu/tech/rss</docs> 
        <lastBuildDate>Mon, 07 May 2012 05:00:00 -0700</lastBuildDate>
        <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://rww.superfeedr.com/" />

                    <item>
                <title><![CDATA[On Privacy in Social Networks: The Provider's Perspective]]></title>
                <description><![CDATA[
                                        <p><em><strong><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/shutterstock_login.jpg" style="" />
			</span>
Editor's note</strong>: This is the second story in a three-part series by Alex Korth on privacy. Read the first post: <a href="http://www.readwriteweb.com/archives/on_privacy_in_social_networks_what_drives_users.php">On Privacy in Social Networks: What Drives Users?</a></em></p>
<p>Most of the time, providers of social networks are commercial enterprises. Developing, bootstrapping and running a social network comes with very high costs, but most services do not charge their users and instead choose different revenue streams. Unfortunately, many users do not question what this means or what its consequences are. As Andrew Lewis famously wrote, "If you're not paying for something, you're not the customer; you're the product being sold."</p>
<p>As mentioned before, most providers of social networks (especially non-business social networks) run their services as ecosystems to generate content and knowledge around their users with the intention of attracting advertisers and application developers. Since only a minority of users contribute the majority of content, providers carefully optimize their sites such that the flow of this valuable user-generated content is maximized. This means that providers intend content to have as many recipients as possible. But this strategy is quite contrary to the goal of most users, who need transparency and control over the reach of their content and information. As a result, there is a need to limit the audience of posts, especially when it comes to private content. These contradictory goals lead to the following problem areas on the provider side.&nbsp;(See "<a href="http://www.readwriteweb.com/archives/on_privacy_in_social_networks_what_drives_users.php">On Privacy in Social Networks: What Drives Users?</a>"&nbsp;to view the problem sources on the user side.)&nbsp;</p>
<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/problems_overview_pt_2.png" style="" />
			</span>
</p>
<h2>3: Privacy Theatre</h2>
<p>Privacy Theatre describes what's observable when reading provider statements in the press: What's promised and what's provided are sometimes not the same. There is a trend to escape criticism from privacy fundamentalists by providing features to control privacy, as <a href="http://preibusch.de/publications/social_networks/privacy_jungle_dataset.htm">research has shown</a>, whilst making these controls hard to use and find. Also, privacy policies are usually not read or understood by ordinary users. No wonder: They are often obfuscated by legal jargon.</p>
<h2>4: Misunderstood Reach</h2>
<p>Given the aforementioned conflict between the parties' goals, it is no wonder that on many networks it's not quite clear to users where content flows. Since, in contrast to the offline world, every provider defines its own rules of information flow, the truth is a combination of settings, friends and privacy policy, and is consequently hard to comprehend. With Privacy Theatre in mind, I assume that providers could easily offer more transparency. However, the critical aspect of missing transparency is that without it, all offered control mechanisms are less effective.</p>
<h2>5: Absence of Control &amp; Oblivion</h2>
<p>The lack of transparency concerning content reach, maintained for the sake of information flow, often comes with a vacuum of easy-to-use controls for users to steer the accessibility of their information. Such controls would include a definable lifetime for information and an easy way to delete it, both of which are rarely offered. Controls are often buried deep in menus, and deletion mechanisms often take several clicks and/or do not really erase things.</p>
<p>"We assumed the digital footprints we left behind - our clickstream exhaust, so to speak - were as ephemeral as a phone call, fleeting, passing, unrecorded... [In fact,] our tracks through the digital sand are eternal." <em>- Tom Zeller Jr.</em></p>
<h2>6: Secondary Privacy Damage</h2>
<p>Without proper transparency about the consequences of their activity, users inadvertently threaten other individuals’ privacy. Common examples include:</p>
<ul>
<li><strong>Uploaded address books:</strong>&nbsp;Sure, it is a handy way to have a provider find friends, but often, these addresses are not deleted on the other side. Private information and social connections are disclosed, without any permission of the threatened individual.</li>
<li><strong>Identity linking:</strong>&nbsp;Tagging a user in a photo, for example, may reveal sensitive information.</li>
</ul>
<h2>7: Security &amp; Data Protection</h2>
<p>One should not forget that there are additional technical aspects that might get problematic with respect to user privacy. The aspects include:</p>
<ul>
<li>Threats and risks through wrongdoers accessing private information.</li>
<li>Security flaws and implementation errors from the side of the social network.</li>
<li>Social graph privacy threatened through computational attacks on private information via the friendship connections and group membership of the user.</li>
</ul>
<p>The last and final part of this post series takes us to a meta-level: What problem areas arise from third parties coming into play?</p>
<p>Now that you've seen seven of nine problem areas of my taxonomy: Did I miss something? Please leave your thoughts in the comments!</p>
<p><em>Login photo courtesy of <a href="http://shutterstock.com">Shutterstock</a>.&nbsp;</em></p>
                    ]]></description>
                <link>http://readwrite.com/2012/05/07/on-privacy-in-social-networks-the-providers-perspective</link>
                <guid>http://readwrite.com/2012/05/07/on-privacy-in-social-networks-the-providers-perspective</guid>
                <category>Privacy</category>
                <pubDate>Mon, 07 May 2012 05:00:00 -0700</pubDate>
                <author>Alexander Korth</author>
            </item>
                    <item>
                <title><![CDATA[On Privacy in Social Networks: What Drives Users?]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/enterprise/assets_c/2009/07/privacy-photo-thumb-150x154-7024.jpg" style="" />
			</span>
<i><b>Editor's note</b>: This is the first in a three-part series by Alex Korth on privacy. In the next post, he will cover problem areas that originate from provider goals and market mechanics.</i></p> 

<p>Today, we are witnessing the mass adoption of social networks. Roughly every 10th citizen of this planet uses these services to communicate with others. To satisfy human needs like socialization and self-esteem, users visit these services, often more than once a day. In communication, whether online or offline, people put their privacy at risk for some benefit.</p>

<p>In the offline world, we have learned since childhood how to do this properly with respect to the culture we live in. We have learned how the physics of the world around us works: We know when the spoken word is recorded or who can see us communicating with someone. In most communication situations, we perceive a level of transparency by sensing our surroundings, which lets us control who receives what we want to say.</p>
<p>For instance, we know how loud to speak in a crowded, noisy room so that only our communication partner hears us. We also know that a postal service's personnel will be able to read a postcard we send.</p>

<p>To communicate in a social network, we intuitively try to transfer learned social norms from the offline to the online world. Unfortunately, the transparency and tools for control we need to maintain our privacy do not find equivalent counterparts there.</p><p>Compared to everyday communication offline, social networks bring a new party into play: the providers. The fact that providers can freely define their platforms' rules for communication is one reason for many of the problem areas highlighted in the following. Commercial providers run social networks as an ecosystem to generate <a href="http://oreilly.com/web2/archive/what-is-web-20.html">content and knowledge</a>. That is content about users and other things like locations or photos. From that content, knowledge can be generated and monetized, for example to run targeted ads. This ecosystem must remain attractive to its users. Otherwise, they would stop revisiting it.</p>

<p>However, <a href="http://www.useit.com/alertbox/participation_inequality.html">Nielsen found</a> that users divide into heavy contributors, intermittent contributors and lurkers. This inequality holds more true the harder a feature is to handle. For instance, founding and populating a group is much harder than <i>liking</i> something, which requires only a single click. Hence, providers design their platforms' rules for information flow so that this rare, valuable content can be spread as broadly as possible. Any tool that chokes the flow of user content is counter-productive to this goal.</p>

<p>The users of a social network want to satisfy human needs. Therefore, we expose some of our personal information to receive some satisfaction in return. As in the offline world, we need tools that provide us with transparency and control of the audience for our content. The problem is as simple as it is contentious: while users want to control the reach of their personal content, which most likely means limiting it, providers want to spread user-generated content as widely as possible to keep up the heartbeat of their products.</p>

<p>In this three-post series, I want to deal with the problem areas involved in this field. As highlighted in the graphic below, this post concentrates on the users' decision-making process as well as their behavior in befriending other users and its consequences.</p>

<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/privacy_korth1.jpg" style="" />
			</span>
</p>

<h2>1: Privacy Balance and the Privacy Paradox</h2>

<p>The privacy balance is something we adjust constantly, both online and offline. As <a href="http://www.amazon.com/Privacy-Freedom-Alan-F-Westin/dp/0370013255">Alan Westin observed</a>, every time we are communicating or in public, we make adjustments between our needs for solitude and companionship, intimacy and general social intercourse, anonymity and responsible participation in society, and reserve and disclosure. In the online world, examples of privacy balance can be found in e-commerce applications: users expose selected personal data, such as credit card details and their postal addresses, for the benefit of not having to leave their homes to shop for goods.</p>

<p>The privacy paradox kicks in when the satisfaction of human needs, such as belonging, self-esteem and respect by others, gets involved. <a href="http://www.citeulike.org/group/2302/article/1202229">Research has shown</a> that users who claimed to protect their privacy at the same time acted against their stated concerns by switching off privacy-preserving controls or massively exposing private information. Our decision-making process is distorted. Subconsciously, we trade off long-term privacy for short-term benefits.</p>

<h2>2: Befriending Strangers</h2>
<p>In most social networks, two users are granted mutual access to each other's personal information once they are friends. So far so good; friendships can be considered a proper and intuitive means to control the reach of personal information and content. However, there are three drawbacks relevant here: 
<ul><li>Firstly, in most social networks friendships are not qualifiable. That means that we are able to control future information flow only in a binary way: either access is granted, or it is not. There is nothing in between that would let us, say, address private content to one group of friends and business content to another. Even if a social network lets you group, tag or define lists of friends: do you actually use these features? Most don't.</li>
<li>Secondly, trust in online systems <a href="http://www.citeulike.org/user/tnhh/article/3245205">has been shown</a> to be of lesser perceived necessity than in face-to-face encounters, encouraging people to befriend strangers as a result of disembodiment and dissociation. The problem is the lack of feedback functions or reminders that these strangers will receive future information flows. 
</li><li>
Thirdly, providers exploit the second point to encourage users to befriend others. By applying principles like <a href="http://www.beingpeterkim.com/2008/07/applying-game-m.html">game mechanics</a>, they provide an easy way to collect friends and satisfy our need for self-esteem and social inclusion while at the same time paving ways for a broader future spreading of content. </li></ul>

<p>How did you like the first two of nine problem areas we described here? Did you notice the problems and pitfalls by yourself? Please share your thoughts in the comments!</p>

<p><em><small>Photo by <a href="http://www.flickr.com/photos/pong/2404940312/">rpongsaj</a></small></em></p>
                    ]]></description>
                <link>http://readwrite.com/2011/04/14/on_privacy_in_social_networks_what_drives_users</link>
                <guid>http://readwrite.com/2011/04/14/on_privacy_in_social_networks_what_drives_users</guid>
                <category>Privacy</category>
                <pubDate>Thu, 14 Apr 2011 03:00:00 -0700</pubDate>
                <author>Alexander Korth</author>
            </item>
                    <item>
                <title><![CDATA[The Trilogy of Webs for Machines: Mashing It All Together]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/images/web_data_apr09a.jpg" style="" />
			</span>
Almost one year ago we started a post series that presented three different webs that are all made for machines. Now it is time to connect those webs and look at examples of how they can be used. To recap, first we looked at the <a href="http://www.readwriteweb.com/archives/web_of_data_machine_accessible_information.php">Web of Data</a>, which contains open, structured data sets consisting of factual knowledge that are linked.</p>
<p>Second was the <a href="http://www.readwriteweb.com/archives/web_of_identities_making_machine-accessible_people_data.php">Web of Identities</a>, which is like the Web of Data, but for people data. Its ability to protect one's privacy and to cope with data volatility differentiates it from the Web of Data. In the Web of Identities, it's people's social graphs that link one identity to another.</p>
<div class="pullquote">"The openness and availability of data, people data and services pave the way to an interoperating ecosystem... "</div>
<p>Third was the <a href="http://www.readwriteweb.com/archives/web_of_services_machine-accessible_services.php">Web of Services</a>, which makes services accessible and processable. Their semantic annotation makes them a part of this series of webs. Machines can be taught to autonomously detect, apply and replace a service, or even <em>link</em> them by chaining or orchestrating them to solve bigger problems or to achieve redundancy or scalability.</p>
<p>For the last several years, mashups have shown us that through APIs, amateur programmers and startups have the ability to access data and services and thereby create appealing new services at low cost and at a low entry barrier. Often, the interfaces are proprietary and lack standardization, so mashup services are hardwired to data and service sources. If one puzzle piece fails, the whole service fails. Usually there are no fall-back mechanisms to automatically replace a data or service source on failure.</p>
<p>The three webs form the basis for tomorrow's mashup generation. All webs follow basic <a href="http://www.w3.org/DesignIssues/Principles.html">Web principles</a>, such as modularization, decentralization and simplicity, and provide accessibility and detectability. The openness and availability of data, people data and services pave the way to an interoperating ecosystem of companies serving the fragments of tomorrow's services.</p>
<p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/images//trilogy_cables-20100617-120459.jpg" style="" />
			</span>
The following scenarios all utilize all three webs. Just like Richard MacManus asked "<a href="http://www.readwriteweb.com/archives/web_of_data_what_would_you_build.php">What would you build with a Web of Data?</a>" this time we ask: What would you build given all these webs? Feel free to contribute your own ideas in the comments section! Here are my app ideas.</p>
<h2>Pretty Social Recommendations</h2>
<p>Bob turns to a service that provides social recommendations and is based on the webs. He queries <em>"recommend books about Berlin for my mother for Christmas"</em>. The service analyzes his query and splits it into a chain of subtasks, which it starts to process.</p>
<p>From the Web of Data, the service gathers general (common sense) knowledge about the terms used in the query. This way, the service learns that a book is a purchasable item, that Berlin is a city in Germany, and so forth. The service also semantically understands <em>"books about Berlin"</em> and queries the Web of Data for books covering Berlin or authors born or living in Berlin.</p>
<p>This initial book list must be filtered using individual and contextual parameters now: Given permission from Bob, his identity provider (IDP) is called to return his mother's Web ID (a Web ID is a standardized identifier linking to the user's profile at the IDP of trust) from his social graph.</p>
<p>The mother's IDP is called to access data about her interest in the topic fields <em>books</em> and <em>Berlin</em>. The IDP returns the set of information to which the mother has granted her family access. The data contains general interests, some book purchases, reviews, comments, ratings and some attention data that was recorded while she read articles online. The service continues by querying the mother's closest friends' IDPs to see if one of them liked or recommended books about Berlin, since friends' recommendations are the most valuable.</p>
<p>The service now searches and calls a ranking service from the Web of Services that can handle books, personal interests and recommendations as input criteria and returns a ranked list of books.</p>
<p>In order to find the best deals for the remaining books, the service now compares and negotiates prices at several book stores via the Web of Services, limiting itself to those that guarantee delivery before December 24.</p>
<p>Finally, the list of books is augmented with prices from different stores and then presented to Bob. Bob selects a book and pays with a checkout service from the Web of Services.</p>
<!-- <p><em><strong>Next page: </strong>Mass Customization</em></p> -->
<!--nextpage-->
<h2>Mass Customization</h2>
<p>Alice recently graduated from a university. She knows that she needs an insurance package but has no idea what it should consist of. She's heard of an intelligent insurance packaging brokerage system which she visits using her browser. She logs into the system with the Web ID she got from her IDP. From the Web of Identities, and with her permission, the system initiates a profile lookup to gather information needed for the components of the insurance package. This saves her precious time.</p>
<p>It queries for information like home address, marital status, age and gender. Since it can't find her current income, it prompts her directly. From the Web of Data, the system now queries for her neighborhood's crime statistics for risk estimates. The system then looks up insurance services it can find on the Web of Services.</p>
<p>It configures the services with the knowledge gathered, selects the best offers and combines them into a personalized insurance package. The package consists of products from different insurers from around the globe. She signs the contracts through the broker and logs out with the satisfaction that she now is neither under- nor over-insured.</p>
<h2>Further Application Areas</h2>
<p>The webs can also be used to filter the real-time Web to individual and context-relevant content. Easy-to-use activity stream queries that are above the level of a single social platform become feasible like <em>"filter by private friends nearby"</em> or <em>"filter by business contacts living in Wellington talking about the real-time Web"</em>. How about a pinch of sentiment analysis: <em>"filter by my boss but only if he is really upset"</em> or <em>"filter by brand XY but only if the community is getting nasty"</em>.</p>
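<p>A query such as <em>"filter by business contacts living in Wellington talking about the real-time Web"</em> could be sketched roughly as follows. This is only an illustration: the <code>Activity</code> record, its fields, the platform names and the sample entries are all made up for the example, and a real system would obtain the relation and city from the Web of Identities instead of storing them on each entry.</p>
<pre>
```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Activity:
    platform: str  # which social platform the entry came from (made-up names below)
    author: str
    relation: str  # hypothetical label taken from the reader's social graph
    city: str      # hypothetical home city taken from the author's profile
    text: str


def filter_stream(stream: Iterable[Activity], relation: str, city: str, topic: str) -> List[Activity]:
    """Keep entries by contacts of the given kind, living in the given city, that mention the topic."""
    topic = topic.lower()
    return [
        a for a in stream
        if a.relation == relation and a.city == city and topic in a.text.lower()
    ]


# Entries merged from two made-up platforms into one stream.
stream = [
    Activity("platform-a", "Ann", "business", "Wellington", "Thoughts on the real-time Web"),
    Activity("platform-b", "Ben", "friend", "Wellington", "The real-time Web is noisy"),
    Activity("platform-b", "Cem", "business", "Berlin", "Real-time Web demo tonight"),
    Activity("platform-a", "Dee", "business", "Wellington", "Lunch was great"),
]

hits = filter_stream(stream, relation="business", city="Wellington", topic="real-time Web")
print([a.author for a in hits])
```
</pre>
<p>The point of the sketch is that the filter works on one merged stream and does not care which platform an entry came from.</p>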
<p>Without a doubt these data and services sources can help to improve lots of existing services at low cost, including augmented reality or location-based services. Valuable knowledge can be provided for locations found on the Web of Data, friends can be displayed if they agreed to expose their location to the querying person via the Web of Identities, and so forth.</p>
<p>These are only a handful of thoughts for a whole new era of applications fueled by an open, linked and semantic basis for data and service sources. What applications can you think of? Or do you find all this creepy?</p>
                    ]]></description>
                <link>http://readwrite.com/2010/06/18/the_trilogy_of_webs_for_machines_mashing_it_all_together</link>
                <guid>http://readwrite.com/2010/06/18/the_trilogy_of_webs_for_machines_mashing_it_all_together</guid>
                <category>Semantic Web</category>
                <pubDate>Fri, 18 Jun 2010 08:30:00 -0700</pubDate>
                <author>Alexander Korth</author>
            </item>
                    <item>
                <title><![CDATA[The Web of Services: Machine-Accessible Services]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/images/web_data_apr09a.jpg" style="" />
			</span>

In the last two posts in this series, we discussed the <a href="http://www.readwriteweb.com/archives/web_of_data_machine_accessible_information.php">Web of data</a>, which makes structured interlinked data sets machine-accessible, and the <a href="http://www.readwriteweb.com/archives/web_of_identities_making_machine-accessible_people_data.php">Web of identities</a>, which makes data about people machine-accessible while addressing privacy and data volatility.</p>

<p>This time, we'll focus on the Web of services, which makes services accessible to and processable for machines. These Webs all have a semantic architecture in common and follow basic <a href="http://www.w3.org/DesignIssues/Principles.html">Web principles</a>, such as being decentralized, modular, simple, addressable via URIs, and built for machines.</p>
<p>The services sector has become the world's biggest business sector, accounting for 64% of the worldwide gross domestic product. The sector has pressure on it to make its services easier and more widely accessible, as well as to quickly adapt to ever faster changes in the market environment.</p>

<p>The effort to standardize such things as service-oriented architectures (SOA) and Web services has taken years, but still we have no clear definition of what constitutes a service at a conceptual level. The interface, which is the format of what goes in and out of the service, is often described formally, but what the service is actually doing, semantically speaking, is not. While there are a number of different approaches to semantically describing Web services, such as <a href="http://en.wikipedia.org/wiki/Owl-s">OWL-S</a>, <a href="http://en.wikipedia.org/wiki/WSMO">WSMO</a> and <a href="http://en.wikipedia.org/wiki/Web_Services_Semantics">WSDL-S</a>, none so far has managed to break out of its academic confines.</p>

<p>Today, there are already all kinds of services with different levels of complexity, and their number is expected to grow exponentially. The services follow different standards, and a lot of them are proprietary, uni-directional and designed to be used by humans to mash up something new. Editorial catalogs such as <a href="http://www.programmableweb.com">ProgrammableWeb</a> and search engines for Web services such as <a href="http://webservices.seekda.com">seekda</a> are designed for humans who are searching for a particular service for that reason. For tasks that are unsolvable for machines, there are even Web services such as <a href="http://www.mturk.com">Amazon's Mechanical Turk</a>, which have humans in the back end answering tricky queries.</p>

<p>The problem with all of this is that each of the tens of thousands of services is <em>accessible</em> but not <em>findable</em> by a machine without a machine-understandable description. Thus, every service nowadays has to be wired to a machine by hand. So, what would machines be capable of if services were annotated with semantic descriptions?</p>

<ul>
<li><strong>Service discovery</strong><br />
Given an index of Web services, a machine charged with finding the right service for a particular problem could choose one among those that have been indexed.</li>

<li><strong>Contracting and execution</strong><br />
Once a service has been selected, a machine could look up its terms and decide on contracting and execution details. How often would the service be needed? And what would be the cheapest contract then?</li>

<li><strong>Billing or revenue sharing</strong><br />
Depending on the autonomy of the machine, one could imagine something like an <a href="http://en.wikipedia.org/wiki/Software_agent">Autonomous Agent</a>, which automatically makes the best deal with the service provider on such things as billing or revenue sharing for service usage.</li>

<li><strong>Replacement on failure, based on experience</strong><br />
Of course, the machine would be able to replace a failing service with an equivalent one. It could also rate a service and publish it.</li>

<li><strong>Service orchestration</strong><br />
A machine could, given enough intelligence, split a task into sub-tasks and then discover, contract and orchestrate services to solve these sub-tasks. Once the sub-tasks have been addressed, the main task would be solved. Such orchestration could involve the parallelization of tasks, for speed or redundancy, or the chaining of services (whereby the output of one service is fed into the next).</li>
</ul>
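<p>Two of the capabilities listed above, replacement on failure and chaining, can be illustrated with a small sketch. It rests on strong simplifying assumptions: each service is modeled as a plain Python function from text to text, equivalent services are simply listed by hand (where a real agent would discover them through semantic descriptions), and all function names are hypothetical.</p>
<pre>
```python
from typing import Callable, Sequence

# Assumption: a service is modeled as a function mapping a text input to a text output.
Service = Callable[[str], str]


def call_with_fallback(candidates: Sequence[Service], payload: str) -> str:
    """Try equivalent services in order and return the first successful result."""
    errors = []
    for service in candidates:
        try:
            return service(payload)
        except Exception as exc:  # a real agent would catch narrower error types
            errors.append(exc)
    raise RuntimeError(f"all {len(candidates)} candidate services failed: {errors}")


def chain(stages: Sequence[Sequence[Service]], payload: str) -> str:
    """Feed the output of each stage into the next, with fallback inside each stage."""
    for candidates in stages:
        payload = call_with_fallback(candidates, payload)
    return payload


# Stand-in services for the example; none of them does real work.
def broken_formatter(text: str) -> str:
    raise ConnectionError("service unavailable")


def working_formatter(text: str) -> str:
    return text.upper()


def shortener(text: str) -> str:
    return text[:5]


# The first stage has two equivalent candidates; the first one fails and is replaced.
result = chain([[broken_formatter, working_formatter], [shortener]], "hello world")
print(result)
```
</pre>
<p>Parallel execution, contracting and billing are left out of this sketch on purpose; it only shows how a failing service could be swapped for an equivalent one without breaking the chain.</p>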

<p>Research projects such as <a href="http://www.tripcom.org">TripCom</a>, <a href="http://www.ip-super.org">SUPER</a>, <a href="http://www.shape-project.eu">SHAPE</a> and <a href="http://www.soa4all.eu">SOA4All</a> are dealing with these ideas and scenarios.</p>

<p>Future scenarios are limited only by our imagination: machines could autonomously pursue goals on behalf of their master user or company, according to a specified level of freedom. These agents could solve increasingly complex problems and be granted increasingly more autonomy (finally ending up as <a href="http://en.wikipedia.org/wiki/Skynet_%28Terminator%29">Skynet</a>).</p>

<p>In the next and final post in this series, we will discuss how all of these scenarios could become a reality with the arrival of all three Webs: a revolution in the ability of machines to access, process and apply information.</p>

<p>Do you also count the Web of services as a third Web? Where do you see its limits?</p>

<p><em>(Photo by <a href="http://www.flickr.com/photos/zorro_art/2748852945/">zorro-art</a>.)</em></p>
                    ]]></description>
                <link>http://readwrite.com/2009/10/16/web_of_services_machine-accessible_services</link>
                <guid>http://readwrite.com/2009/10/16/web_of_services_machine-accessible_services</guid>
                <category>Semantic Web</category>
                <pubDate>Fri, 16 Oct 2009 05:00:39 -0700</pubDate>
                <author>Alexander Korth</author>
            </item>
                    <item>
                <title><![CDATA[The Web of Identities: Making Machine-Accessible People Data]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/images/web_data_apr09a.jpg" style="" />
			</span>
In a previous article, we discussed the <a href="http://www.readwriteweb.com/archives/web_of_data_machine_accessible_information.php">Web of data</a>, which is about inter-linking open data sets and, thus, turning them into machine-accessible structured data. In this post, we'll draw a picture of how the emerging social Web could serve as a Web of identities, which is essentially a people-data version of the Web of data.</p>
<p>W3C's <a href="http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData">Linking Open Data</a> (LOD) project has gotten quite a bit of attention for the good job it does with the Web of data. Currently, all <a href="http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets">participating data sets</a> are accessible free of charge and can be used without constraints. The project focuses on growth for now. In an email, Chris Bizer hinted that a payment model to charge for particular content may come in the future.</p>

<p>The LOD approach is very good for static and encyclopedic knowledge, but what about accessing our personal data? Technically, modeling our identity, profile data, social graph, groups, activity stream, assets, and other kinds of personal data is straightforward. But empowering machines to access this data could present challenges to the LOD approach, because it comes with all sorts of constraints and peculiarities, such as privacy and data volatility. People want control over who has access to their data or parts of their data and want to be able to block access for any reason. And issues such as rapidly changing and outdated data remain unaddressed.</p>

<p>This is where the social Web can help.</p>

<h2>The Emerging Social Web</h2>

<p><a href="http://www.flickr.com/photos/39766806@N04/3653987411/"><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/images/machine_accessible_jul09a.jpg" style="" />
			</span>
</a></p>

<p>There was a time when we had to create a new digital identity for each social application we wanted to use. A social application provides features based on <a href="http://connollyshaun.blogspot.com/2008/05/7-key-attributes-of-social-web.html">social attributes</a>. Every application provider implemented its own proprietary ID management to authorize users to log on and implemented its own proprietary user profile system to manage information about its users. Application providers were judged by the size of their user and content base and so erected endless walled gardens to protect their properties.</p>

<p>The most significant issues people had were:</p>

<ol>
<li>Low conversion rate for user registration,</li>
<li>Users had to register for many accounts,</li>
<li>Users had to re-enter and synchronize profile data,</li>
<li>Privacy, data ownership, and inability to export.</li>
</ol>

<p><a href="http://www.flickr.com/photos/39766806@N04/3654786526/"><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/images/machine_accessible_jul09b.jpg" style="" />
			</span>
</a></p>

<p>Not much has changed, unfortunately. Most remarkable, perhaps, is the growing number of <a href="http://en.wikipedia.org/wiki/Single_sign_on">single sign-on</a> (SSO) solutions that address the first issue for application providers and the second issue for users. New application providers can now outsource this functionality to a third-party SSO provider. Some of the biggest application providers became ID providers themselves to allow their users to log on to third-party applications with the same ID, and this has gained traction beyond these few providers. This has led us to an era of <a href="http://therealmccrea.com/2008/12/19/as-online-identity-war-breaks-out-janrain-becomes-switzerland/">identity wars</a> between the big providers.</p>

<p>Many ID providers, such as Google, Yahoo!, MySpace, and Facebook, have added the <a href="http://openid.net/">OpenID</a> SSO to their own proprietary mechanisms over time. Because of the open nature of OpenID, many third-party providers have found it easy to integrate with the bigger providers, giving them more traction because users are able to access their services so easily using their OpenID credentials. Now, these ID providers can offer read-only access to fragments of profile data that users can look up or copy to third-party applications. Like SSO and OpenID, this began with proprietary solutions, but now exchange formats and protocols are emerging whose open language allows applications to easily exchange and synchronize data. These include:</p>

<ul>
<li>API access authorization protocol <a href="http://oauth.net/">OAuth</a>,</li>
<li>Social graph exchange format <a href="http://www.foaf-project.org/">FOAF</a> ("friend of a friend"),</li>
<li>Updates exchange format <a href="http://activitystrea.ms/">activity streams</a>,</li>
<li>Address book exchange format <a href="http://portablecontacts.net/">Portable Contacts</a>.</li>
</ul>

<p><a href="http://www.flickr.com/photos/39766806@N04/3653988285/"><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/images/machine_accessible_jul09c.jpg" style="" />
			</span>
</a></p>

<p>In the future, ID providers will loosen their connection to social applications and start taking over management of users' social attributes. Users will be able to log in to applications using credentials hosted by their ID providers of choice and grant permissions to these applications to read or even sync selected fragments of their profile data. The borders of these walled gardens will thus blur, and the social Web will become more of a weave than a patchwork quilt.</p>

<h2>The Web of Identities</h2>

<p><a href="http://www.flickr.com/photos/39766806@N04/3653988905/"><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/images/machine_accessible_jul09d.jpg" style="" />
			</span>
</a></p>

<p>The Web of data is a distributed web of interconnected sets of semantically annotated data. A connection is achieved as a result of data pointing to data contained in another set through a URI, just as websites point to each other with URIs. This way, machines can crawl the sets to read the data. ID providers will most likely refer to their users via URIs in the future as well. A social connection will consist of one user's URI pointing to another user's URI or ID provider. If permitted by users, a machine may very well accomplish its tasks by jumping through the Web of identities from user to user, the way it does through the Web of data.</p>
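<p>The traversal described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the user URIs, the <code>knows</code> links, and the <code>allows_crawl</code> flag are hypothetical stand-ins for whatever an ID provider would actually expose, and the profiles live in a local dictionary rather than being fetched over the network.</p>

```python
from collections import deque

# Hypothetical identity records keyed by user URI. "allows_crawl" stands in
# for the permission a user would grant through an ID provider.
PROFILES = {
    "https://id-a.example/alice": {
        "knows": ["https://id-b.example/bob", "https://id-a.example/carol"],
        "allows_crawl": True,
    },
    "https://id-b.example/bob": {
        "knows": ["https://id-c.example/dave"],
        "allows_crawl": True,
    },
    "https://id-a.example/carol": {
        "knows": ["https://id-c.example/erin"],
        "allows_crawl": False,
    },
    "https://id-c.example/dave": {"knows": [], "allows_crawl": True},
    "https://id-c.example/erin": {"knows": [], "allows_crawl": True},
}


def crawl(start, profiles, max_hops=2):
    """Breadth-first walk over 'knows' links, skipping users without permission."""
    seen = {start}
    visited = []
    queue = deque([(start, 0)])
    while queue:
        uri, hops = queue.popleft()
        profile = profiles.get(uri)
        if profile is None or not profile["allows_crawl"]:
            continue  # unknown or private user: neither read nor followed
        visited.append(uri)
        if hops == max_hops:
            continue  # hop limit reached: read this user but go no further
        for friend in profile["knows"]:
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return visited


print(crawl("https://id-a.example/alice", PROFILES))
```

<p>In this sketch, a user who withholds permission also hides the links that lead onward from them, so the machine only sees the part of the graph that users have opened up.</p>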

<p>Why is this needed? The Web of identities is in effect a super-social graph that spans multiple ID providers. To reach across walled gardens, we would need this infrastructure for all of the social search functions we perform. The following examples are thus far provided only (if at all) within individual applications:</p>

<ul>
<li><strong>"What is the best book read by friends in my circle?"</strong><br />
This query might retrieve book purchases and book-related status updates that your friends have made accessible through their privacy settings and then rank the books in a set.</li>

<li><strong>"Notify me if a close friend visits Berlin."</strong><br />
This permanent task repeatedly looks up your friends' geo-locations. You may have granted your close friends access to this data, too. This task could even be combined with the Web of data to look up the meaning and location of Berlin.</li>

<li><strong>"Sync my address book."</strong><br />
This permanent task continually synchronizes my friends' addresses and numbers with my personal address book.</li>
</ul>
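<p>To make the first query above more concrete, here is a minimal Python sketch. It assumes the friends' activity items have already been collected from their ID providers. The item fields, the <code>shared</code> flag, and the ranking rule (count of distinct friends per book) are hypothetical choices for illustration, not a description of any existing service.</p>

```python
from collections import Counter

# Hypothetical activity items gathered from friends. "shared" marks whether
# the friend's privacy settings make the item accessible to us.
ACTIVITIES = [
    {"friend": "bob", "book": "Book A", "shared": True},
    {"friend": "carol", "book": "Book A", "shared": True},
    {"friend": "dave", "book": "Book B", "shared": True},
    {"friend": "erin", "book": "Book B", "shared": False},
    {"friend": "erin", "book": "Book C", "shared": False},
]


def rank_books(activities):
    """Rank books by the number of distinct friends who shared an item about them."""
    readers = {}
    for item in activities:
        if not item["shared"]:
            continue  # respect the friend's privacy settings
        readers.setdefault(item["book"], set()).add(item["friend"])
    counts = Counter({book: len(friends) for book, friends in readers.items()})
    return counts.most_common()


print(rank_books(ACTIVITIES))
```

<p>Items that friends have not shared never enter the ranking, which is the behavior the query description calls for.</p>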

<p>Now it's your turn. In what ways do you think the social Web and Web of identities are evolving?</p>

<p><em>(Diagrams by <a href="http://www.flickr.com/photos/39766806@N04/sets/72157620393913974/">alexkorth</a>)</em>
</p>
                    ]]></description>
                <link>http://readwrite.com/2009/07/11/web_of_identities_making_machine-accessible_people_data</link>
                <guid>http://readwrite.com/2009/07/11/web_of_identities_making_machine-accessible_people_data</guid>
                <category>web</category>
                <pubDate>Sat, 11 Jul 2009 07:04:57 -0700</pubDate>
                <author>Alexander Korth</author>
            </item>
                    <item>
                <title><![CDATA[The Web of Data: Creating Machine-Accessible Information]]></title>
                <description><![CDATA[
                                        <p><span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/images/web_data_apr09a.jpg" style="" />
			</span>
In the coming years, we will see a revolution in the ability of machines to access, process, and apply information. This revolution will emerge from three distinct areas of activity connected to the Semantic Web: the Web of Data, the Web of Services, and the Web of Identity providers. These webs aim to make semantic knowledge of data accessible, semantic services available and connectable, and semantic knowledge of individuals processable, respectively. In this post, we will look at the first of these Webs (of Data) and see how making information accessible to machines will transform how we find information.</p>
<p>The amount of information and services available is growing exponentially. Every day, it is getting harder to find the information we are actually looking for. Still, we have to learn how to tell machines what we want. Why can't a machine understand which website, recent tweet, Flickr photo, Facebook message, or restaurant we are currently looking for?</p>

<p>Because it can't. It does not understand. It has no access to most sources. It lacks the semantic understanding and common sense to build bridges between information.</p>

<p>It is critical that machines gain a new level of understanding. Instead of statistically computing how well a search term matches a document, a machine must literally be able to understand. Therefore, knowledge bases are needed to look things up. Examples of these knowledge bases include:</p>

<ul>
<li>an encyclopedia containing knowledge to look up the semantic meaning and context of a particular term (e.g. to understand that Berlin is a city, how many people live there, and where it is),</li>
<li>Yellow Pages or a service pool to query often-changing and more complex information (e.g. a route from Berlin to Porto by car, or the current temperature of Porto in Celsius),</li>
<li>a people database to look up profile information, with user permissions, which could improve personalization and recommendations.</li>
</ul>

<h2>The Web of Data</h2>

<p>The idea of the Web of Data originated with the Semantic Web. People tried to solve the problem of the inherent inability of machines to understand web pages. Initially, the aim of the Semantic Web was to invisibly annotate web pages with a set of meta-attributes and categories to enable machines to interpret text and put it in some kind of context. This approach did not succeed because the annotation was too complicated for humans who had no technical background. Similar approaches, like <a href="http://microformats.org/">microformats</a>, simplify the markup process and thus help overcome this chicken-and-egg problem.</p>

<p>These approaches have in common the effort to improve the machine-accessibility of knowledge on web pages that were designed to be consumed by humans. Furthermore, these sites contain a lot of information that is irrelevant to machines and that needs to be filtered. What is needed is a knowledge base for machines to look up "noiseless" information. But wait! Who said that machines and we humans need to share one web anyway?</p>

<p>The idea of the Web of Data came about as a result of both this limitation and the existence of countless structured data sets distributed all over the world and containing all kinds of information. These data sets are the property of companies that tend to make them accessible. Typically, a data set contains knowledge about a particular domain, like books, music, encyclopedic data, companies, you name it. If these data sets were interconnected (i.e. linked to each other like websites), a machine could traverse this independent web of noiseless, structured information to gather semantic knowledge of arbitrary entities and domains. The result would be a massive, freely accessible knowledge base forming the foundation of a new generation of applications and services.</p>

<h2>Linking Open Data</h2>

<span class="embedded-Media-image img-caption-c">
				<img src="http://readwrite.com/files/files/files/images/web_data_apr09b.png" style="" />
			</span>


<p>One promising approach is W3C's <a href="http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData">Linking Open Data</a> (LOD) project. The above image illustrates <a href="http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets">participating data sets</a>. The data sets themselves are set up to re-use existing ontologies such as <a href="http://www.w3.org/TR/wordnet-rdf/">WordNet</a>, <a href="http://www.foaf-project.org/">FOAF</a>, and <a href="http://www.w3.org/TR/skos-reference/">SKOS</a> and interconnect them.</p>

<p>The data sets all grant access to their knowledge bases and link to items of other data sets. The project follows basic design principles of the World Wide Web: simplicity, tolerance, modular design, and decentralization. The LOD project currently counts more than 2 billion RDF triples, which is a lot of knowledge. (A triple is a piece of information that consists of a subject, predicate, and object to express a particular subject's property or relationship to another subject.) Also, the number of participating data sets is rapidly growing. The data sets currently can be accessed in heterogeneous ways; for example, through a semantic web browser or by being crawled by a semantic search engine.</p>
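<p>As a rough illustration of the triple model described above, the following Python sketch stores a few triples as plain tuples and answers simple pattern queries. The <code>ex:</code> identifiers and the <code>match</code> and <code>cities</code> helpers are hypothetical and are not taken from any LOD data set or RDF library.</p>

```python
# A triple is (subject, predicate, object). All identifiers here are
# hypothetical examples, not entries from a real LOD data set.
TRIPLES = [
    ("ex:Berlin", "rdf:type", "ex:City"),
    ("ex:Berlin", "ex:country", "ex:Germany"),
    ("ex:Porto", "rdf:type", "ex:City"),
    ("ex:Porto", "ex:country", "ex:Portugal"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s)
        and predicate in (None, p)
        and obj in (None, o)
    ]


def cities(triples):
    """Return every subject that is typed as ex:City."""
    return [s for (s, _, _) in match(triples, predicate="rdf:type", obj="ex:City")]


print(cities(TRIPLES))
print(match(TRIPLES, subject="ex:Berlin", predicate="ex:country"))
```

<p>A real data set would be queried through a dedicated store or query language, but the subject, predicate, and object pattern is the same idea at a much smaller scale.</p>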

<p>To get a feel for what this machine Web of Data is like, you may want to look up:</p>

<ul>
<li>the <a href="http://cb.semsol.org/company/yahoo">company Yahoo!</a> on <a href="http://crunchbase.com/">CrunchBase</a>,</li>
<li><a href="http://dbpedia.org/page/Berlin">the city of Berlin</a> or <a href="http://dbpedia.org/page/Tetris">the game Tetris</a> on <a href="http://dbpedia.org/">DBpedia</a>,</li>
<li>the book <a href="http://opmi.labs.oreilly.com/product/9780596521677"><em>iPhone: The Missing Manual</em></a> on <a href="http://oreilly.com/">O'Reilly Media</a>.</li>
</ul>

<p>With every fact available on the Web of Data, more general and specific knowledge is made accessible to machines that will enable a whole new generation of services to be created. Highly sophisticated queries become machine-processable and accessible to the next generation of, say, search services.</p>

<p>Check out Tim Berners-Lee's <a href="http://www.ted.com/index.php/talks/tim_berners_lee_on_the_next_web.html">talk at TED about the Web of Data</a>. What do you think about it? Do you encounter the same issues of information overload or too much noise?</p>

<p><em>(Photo by <a href="http://www.flickr.com/photos/zorro_art/2748852945/">zorro-art</a>. Graph by the <a href="http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData">Linking Open Data project</a>.)</em></p>
                    ]]></description>
                <link>http://readwrite.com/2009/04/18/web_of_data_machine_accessible_information</link>
                <guid>http://readwrite.com/2009/04/18/web_of_data_machine_accessible_information</guid>
                <category>Semantic Web</category>
                <pubDate>Sat, 18 Apr 2009 03:00:00 -0700</pubDate>
                <author>Alexander Korth</author>
            </item>
            </channel>
</rss>

