When it comes to news-reading apps, iPad app Zite is a favorite amongst many of the staffers at ReadWriteWeb. It provides a personalized news feed based on your interests, social graph and the community. Zite will bring you news catered to your interests but also provide serendipitous discovery of new sources and topics that may be of interest. This is all useful and interesting functionality … but how the heck does it work?
Zite (now owned by CNN), at its core, is a data-parsing engine tied to the social graph. Its roots lie in a social discovery search engine called Worio, which the team eventually folded to create Zite. Article URLs are parsed out of the social graph, mapped and weighted. How does the company pull this all off? Today, Zite gives users a peek under the hood.
Worio’s algorithm directly informs how Zite works, co-founder Mike Klaas told ReadWriteWeb.
“With Worio we were trying to bring contextual discovery to keyword search. For instance, if you searched for a restaurant in your neighborhood, Worio would recommend other restaurants in the area, or a foodie blog for your city. The basis of this technology was understanding the user’s wider interests, something that translated almost directly into the core personalization algorithm of Zite,” Klaas said.
In a blog post, Zite outlines how it curates content to your interests. There are five main points that revolve around the Three M’s: “mining, modeling and matching.” Zite mines content from your social Web, models that content along with the community and your particular interests. It then matches your interests to the content and the community to inform what it shows you in the app.
Let’s break that down a little bit further. Think of Zite as an assembly line serving your URLs from your social network. There are steps a URL must go through before it appears in your reader.
Zite finds what is interesting by monitoring URLs shared through whatever social networks you decide to hook up to the app, such as Twitter and Delicious. It then throws out spam (because there is ALWAYS spam) and associates each URL with the user who shared it. Zite will then calculate the credibility of that user, like assigning a weight to a variable. More popular users with original content that is often shared will have greater weights. Zite will then queue that content as something to potentially show the user.
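To make the mining step concrete, here is a minimal sketch of how shared URLs might be filtered and weighted. It is not Zite's code: the function name, the credibility numbers, the default weight for unknown users and the spam-domain list are all assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def score_shared_urls(shares, credibility, spam_domains, default_weight=0.1):
    """Sum sharer credibility per URL, skipping URLs on known spam domains.

    shares: iterable of (user, url) pairs pulled from social networks.
    credibility: hypothetical per-user weights (how Zite computes them is not public).
    """
    scores = defaultdict(float)
    for user, url in shares:
        host = urlparse(url).hostname or ""
        if host in spam_domains:
            continue  # throw out spam before it enters the queue
        scores[url] += credibility.get(user, default_weight)
    # Highest-weighted URLs come first in the candidate queue.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


shares = [
    ("alice", "http://example.com/a"),
    ("bob", "http://example.com/a"),
    ("carol", "http://spam.example/x"),
    ("bob", "http://example.org/b"),
]
queue = score_shared_urls(
    shares,
    credibility={"alice": 0.9, "bob": 0.4, "carol": 0.8},
    spam_domains={"spam.example"},
)
```

In this toy run the URL shared by both alice and bob ends up at the head of the queue, and the spam link never makes it in.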
The URL then moves down the assembly line. Zite takes these “vetted” URLs and strips out all extraneous, non-readable material. That includes HTML formatting, scripts and so on. The text of the document is then analyzed via text mining and term extraction to capture what the content is about. It parses names, dates, places and other topical information. Reading an article about Google CEO Larry Page? Zite will be able to know that and give you an option to get more news about Page. Same with authors and reporters: Zite pulls the metadata from the post. Want to read more from ReadWriteWeb’s Dan Rowinski? Zite can show you more of my articles.
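The stripping and term-extraction step can be sketched roughly like this. Plain word counting stands in for real text mining here, and the class name, function name and stopword list are invented for the example; Zite has not published how it actually extracts terms.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative stopword list; a real system would use a far larger one.
STOPWORDS = {"the", "and", "for", "that", "with", "said", "was", "are", "this"}


class TextExtractor(HTMLParser):
    """Collects readable text and ignores the contents of script/style tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def extract_terms(html, top_n=5):
    """Return the most frequent non-stopword terms in the page's readable text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z]+", " ".join(parser.parts).lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return [term for term, _ in counts.most_common(top_n)]


page = ("<html><script>var tracker = 1;</script>"
        "<p>Google CEO Larry Page spoke. Page said Google is growing.</p></html>")
terms = extract_terms(page)
```

For the sample page, "google" and "page" come out as the most frequent terms, while the script contents are ignored. Recognizing that "Larry Page" is a person would take a proper named-entity step, which this sketch does not attempt.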
Zite then looks to model the community. Think of it like recommendations from Amazon or Netflix. The same heuristic model applies. Relationships are correlated between users and documents based on what the app has captured from the social Web. This creates a map of document-to-user relationships. It then condenses that information to match your interests later.
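One common way to build this kind of Amazon-style recommendation is user-based collaborative filtering. The sketch below uses Jaccard overlap between users' shared documents; that choice, along with every name in the code, is an assumption for illustration and not something Zite has confirmed.

```python
def jaccard(a, b):
    """Overlap between two sets of documents, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, user_docs, top_n=3):
    """Suggest documents liked by users whose history overlaps the target's."""
    seen = user_docs[target]
    scores = {}
    for other, docs in user_docs.items():
        if other == target:
            continue
        similarity = jaccard(seen, docs)
        if similarity == 0:
            continue  # no shared documents, so no signal from this user
        for doc in docs - seen:
            scores[doc] = scores.get(doc, 0.0) + similarity
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]


user_docs = {
    "you": {"a", "b"},
    "u1": {"a", "b", "c"},
    "u2": {"b", "d"},
    "u3": {"e"},
}
print(recommend("you", user_docs))  # ['c', 'd']
```

Document "c" ranks above "d" because the user who shared it has more in common with you, and "e" never appears because its only sharer has no overlap with your history.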
Zite takes your preferences into account as well. If you were reading this article in the app, you could tell it whether you enjoyed reading it with a thumbs up or a thumbs down. It will likely parse categories for you like “apps,” “iPad,” or “Zite.” It will ask if you want more content from the author and publication. This is what Zite calls “modeling you.” You are the end of the conveyor belt, choosing what content you consume and how. Zite will also throw in some stories that may not match your direct preferences, as a way to provide variety and discovery.
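A thumbs up or thumbs down could feed a per-topic interest profile along these lines. This is only a sketch: the function name, the fixed step size and the clamping range are hypothetical, and Zite's real user model is surely more involved.

```python
def apply_feedback(profile, article_tags, liked, step=0.2):
    """Return a new interest profile nudged up or down for each tag on the article.

    profile: dict mapping topic -> interest weight, kept within [-1.0, 1.0].
    """
    delta = step if liked else -step
    updated = dict(profile)  # leave the caller's profile untouched
    for tag in article_tags:
        new_weight = updated.get(tag, 0.0) + delta
        updated[tag] = max(-1.0, min(1.0, new_weight))
    return updated


profile = apply_feedback({}, ["apps", "ipad", "zite"], liked=True)
profile = apply_feedback(profile, ["ipad"], liked=False)
```

After a thumbs up on this article and a thumbs down on another iPad story, the "apps" and "zite" weights stay positive while "ipad" drops back to neutral.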
The end of the process is delivery. The app knows what you read and what you do not, compares high-scoring documents with your interests and matches them to your topics. Age is factored into the weight of a story, with the score going down the older the story is. Finally, it is shipped to your iPad for consumption.
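The matching step, with its age penalty, might look something like the sketch below. Zite says only that a story's score drops as it gets older; the exponential half-life used here, the field names and the way the scores are combined are all assumptions.

```python
def rank_stories(stories, interests, half_life_hours=24.0):
    """Order stories by interest match times document score, discounted by age.

    Each story is a dict with hypothetical keys:
    "title", "topics", "doc_score" and "age_hours".
    """
    ranked = []
    for story in stories:
        relevance = sum(interests.get(topic, 0.0) for topic in story["topics"])
        # Assumed decay: the score halves every `half_life_hours`.
        decay = 0.5 ** (story["age_hours"] / half_life_hours)
        ranked.append((relevance * story["doc_score"] * decay, story["title"]))
    ranked.sort(reverse=True)
    return [title for _, title in ranked]


stories = [
    {"title": "old", "topics": ["ipad"], "doc_score": 1.0, "age_hours": 48},
    {"title": "offtopic", "topics": ["golf"], "doc_score": 1.0, "age_hours": 0},
    {"title": "fresh", "topics": ["ipad"], "doc_score": 1.0, "age_hours": 0},
]
print(rank_stories(stories, {"ipad": 1.0}))  # ['fresh', 'old', 'offtopic']
```

With a 24-hour half-life, the two-day-old iPad story keeps a quarter of its score, so it still outranks a brand-new story on a topic you have shown no interest in.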
Zite has a fascinating process. I keep thinking of it in terms of the first Austin Powers movie, when he is woken from being cryogenically frozen and is shipped down the line to re-acclimate himself to the real world. Instead of an overly hairy Mike Myers, Zite casts a large net to capture URLs and redress them for your iPad consumption.