"I Tried YQL Execute and All I Got Was an Authenticated Javascript API Processing Layer in the Cloud"

A vast amount of data is available on the Web through APIs or even straight HTML. It's all there for the parsing, and parsed data from social media in particular is held to be a goldmine. But traditionally the heavy lifting (the wide variety of programming languages used in APIs, complicated authentication schemes, the occasional need for massive pipes) has made accessing that data and sorting it into useful applications a laborious process.

Yahoo!, chiefly to serve the needs of its own engineers, has been developing a sophisticated solution that is agnostic across Internet platforms and that lowers both the burden of labor and the barriers to entry for developers of social and other web applications. Many of them are already singing the praises of the newly released YQL Execute.

"It adds a lot of power," said Mike Cannon-Brookes, co-founder of Atlassian, an Australian collaboration and development software company widely recognized as one of the biggest stars in the Enterprise 2.0 world.

"YQL Execute allows you to build tables of data from other sources online, using JavaScript as a programming language and run it on Yahoo's servers, so the infrastructure needs are very small."

In the slightly more technical language presented on the Yahoo! Developer Network Blog, "The Execute element can contain arbitrary developer code that the YQL data engine runs during the processing of a YQL statement."

It also handles authentication for third-party sites.

Is there anything like it currently on the market?

"Nothing... It's pretty awesome," said Cannon-Brookes.

Yahoo! Query Language

According to Yahoo! Chief Technologist Sam Pullara, the idea behind YQL (launched in October 2008) was to create an agnostic query language similar to SQL, a language familiar to most developers, and let developers use it to query the Internet as if it were one huge database. "If you make it universally and simply accessible so every application developer doesn't have to learn every API, it'll be easier for developers to create apps from the data users have taken so much time to make available on the Internet."

YQL looks a lot like SQL, but instead of querying a local database it treats information on the Web as virtual tables that developers can manipulate in a standardized way, regardless of the API the data came from. Developers had to know only YQL to quickly create simple mashups.
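To make the SQL resemblance concrete, here is a minimal sketch of how a YQL statement could be packaged into a web request. The endpoint URL, the `flickr.photos.search` table, and the `text` filter are assumptions for illustration based on Yahoo!'s public examples of the period, and `build_yql_url` is a hypothetical helper, not part of any Yahoo! library. The code only builds the URL and does not contact any server.

```python
from urllib.parse import urlencode

# Assumed endpoint: the public YQL REST URL as Yahoo! documented it at the time.
YQL_PUBLIC_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(statement: str, fmt: str = "json") -> str:
    """Return a GET URL carrying a YQL statement in the `q` parameter."""
    return YQL_PUBLIC_ENDPOINT + "?" + urlencode({"q": statement, "format": fmt})


# Illustrative statement: a web API addressed as if it were a SQL table.
statement = 'select * from flickr.photos.search where text="sunset" limit 5'
url = build_yql_url(statement)
print(url)
```

The point of the sketch is that the developer writes one familiar-looking statement and never touches the underlying API's own request format.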

Open Data Tables

Then, this February, Yahoo! launched Open Data Tables. "Initially," said Pullara, "we had a lot of default tables in the system, mostly Yahoo! APIs, things like Flickr, local search, Yahoo! weather. For accessing the rest of the Internet, we created dynamic tables that understood things like XML, Atom, RSS, comma-separated value tables such as spreadsheets, etc. Dynamic tables let you access them but not abstract them. Open data tables let you map a third-party site, making the data accessible with YQL."

With open data tables, YQL could support a broad range of APIs, covering almost anything publicly available online, from FriendFeed and Google Reader to the Guardian newspaper. "No one has yet pointed out an API they can't figure out how to map," said Pullara.

However, some data, such as that held by Google Calendar or Netflix, could not be accessed without authentication. Those APIs were often sophisticated and could be complicated for the end developer to work with. For these APIs, Yahoo! rolled out YQL Execute on April 29.

YQL Execute

"With Execute," said Pullara, "the code only needs to be written once, and not necessarily by the app developer. The authentication is all covered by the Yahoo cloud."

YQL Execute also allows developers to access multiple services and get a single result back. For example, an app developer could call up New York Times articles with specific tags AND Flickr photos with related tags; YQL Execute would return a combined result with both articles and related photographs. Another benefit for developers is the use of the massive Yahoo! infrastructure, as all the heavy lifting of data is done on Yahoo! servers.
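The combined-result idea can be sketched in a few lines. This is only a local illustration under stated assumptions: the `Tagged` record, the `combine_by_tag` function, and the sample articles and photos are all hypothetical stand-ins, and the sketch does not call the New York Times or Flickr APIs or reproduce how Yahoo!'s engine merges results.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Tagged:
    """A hypothetical item (article or photo) with a title and tags."""
    title: str
    tags: List[str]


def combine_by_tag(articles: List[Tagged], photos: List[Tagged], tag: str) -> Dict[str, object]:
    """Return one result holding the articles and photos that share a tag."""
    return {
        "tag": tag,
        "articles": [a.title for a in articles if tag in a.tags],
        "photos": [p.title for p in photos if tag in p.tags],
    }


# Made-up sample data standing in for two separate services.
articles = [
    Tagged("Opening Day Recap", ["baseball", "sports"]),
    Tagged("City Budget Vote", ["politics"]),
]
photos = [
    Tagged("Stadium at dusk", ["baseball", "stadium"]),
    Tagged("Harbor sunrise", ["harbor"]),
]

combined = combine_by_tag(articles, photos, "baseball")
print(combined)
```

With Execute, logic of this kind would run on Yahoo!'s servers, so the calling application receives a single merged response instead of making and reconciling two requests itself.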

And because of the speed, simplicity, and scope of these tools, the implications now reach much further than simple mashups. With access to authenticated and private data, more sophisticated applications can be written quickly and easily.

The Dark Side

"The fact is this: If you do not patent, if you do not copyright, if you do not privatize, and if you do not own, you will be ripped off by someone; and you asked for it."

The above quotation is from Scott D. Reinhart, who has been eyeball-deep in application development longer than many "social media gurus" have been out of high school.

Right alongside the generally held social media dictum that a rich data stream is inherently bankable, there is the hotly debated issue of data ownership. Especially when data is made more valuable by having been parsed, organized, and compared, and most especially when someone creates a revenue stream from previously unmonetizable data, questions of ownership and copyright flare up around the social web.

"Public APIs allow you to easily develop using mature platforms," said Reinhart, "but they [large IT and social media companies] usually have a hidden intention. In this case they advocate putting your database layer onto their systems... So let's say I use the Yahoo! data layer, I use BizSpark to get my development tools, and I am making MySpace (Open Social) and Facebook apps using jQuery - who owns my code? Technically, they own everything. They can claim I just made a mashup.

"I would, as someone approaching these systems, stop drinking the Kool-Aid and read the terms of use. Check what it says about ownership."

Yahoo! Servers for YQL Developers

On the question of whether Yahoo! lays any claim to developers' IP, however, Pullara said, "We don't own anything.

"If you create an open data table, there's no requirement to upload it to Yahoo! We do cache data that we pull from APIs and the web to make it faster, but we don't store that data. It passes through without being collected for permanent storage."

By contrast, with other services such as Google or Amazon Web Services, developers are required to upload their data, which is stored and executed on the company's systems. With YQL, a developer's data has "a very transient experience and expires from the cache," said Pullara. "It's a convenience, not a requirement in any way."

The Price of Free

Yahoo! has begun investigating potential commercialization of YQL technologies.

"We want to enable rather than discourage more usage," said Pullara. "And while people don't want to pay, they do want to know they're a customer and have a relationship with Yahoo!"

Currently, Yahoo! has set certain limits on use of its infrastructure. App developers are limited to 100,000 calls per day per IP address. If the application runs in users' browsers (and hence on many different IPs), the limit is a non-issue. Pullara said, "The limit targets those who would abuse the platform... people who might spin up DoS attacks. You have to have controls in place to make sure that doesn't happen."
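For readers wondering how a per-day, per-IP cap behaves in practice, the following is a minimal sketch of one way such a counter could work. It is an assumption for illustration, not Yahoo!'s implementation: the `DailyCallLimiter` class is hypothetical, only the 100,000 figure comes from the article, and the demo uses a tiny limit and a documentation-range IP address so the cutoff is easy to see.

```python
from collections import defaultdict
from datetime import date

DAILY_LIMIT = 100_000  # calls per day per IP address, the figure quoted above


class DailyCallLimiter:
    """Hypothetical in-memory counter keyed by (IP address, calendar day)."""

    def __init__(self, limit: int = DAILY_LIMIT) -> None:
        self.limit = limit
        self._counts = defaultdict(int)  # (ip, day) -> calls made so far

    def allow(self, ip: str, day: date) -> bool:
        """Record a call and return True, or return False once the cap is hit."""
        key = (ip, day)
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True


# Demo with a small limit: the fourth call from one IP on the same day is refused.
limiter = DailyCallLimiter(limit=3)
today = date(2009, 5, 1)
results = [limiter.allow("203.0.113.7", today) for _ in range(4)]
next_day_ok = limiter.allow("203.0.113.7", date(2009, 5, 2))
print(results, next_day_ok)
```

Because the count is kept per address, an application running in many users' browsers spreads its calls across many counters, which is why the cap mainly affects a single machine making heavy use of the service.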

Many developers are enthusiastic about the legitimate and value-adding implementations of the technologies. "The YQL improvements are just sex on legs," said Cannon-Brookes via Twitter. "The most exciting, least talked about 'tech of now' is YQL."