Chasing Real-Time Raindrops in an Ocean of Content

The Web is huge. And growing. Faster every day. It’s almost like an ocean where there’s no evaporation (the data on the Web stays there virtually forever), yet it’s always raining. The rain is the new content that’s added to the ocean.

Every tweet is a drop, every blog post is a drop, every check-in is a drop that falls into the ocean. This ocean is almost constantly under a tropical storm in some places, like Twitter or Facebook.

Guest author Julien Genestoux is the founder and CEO of Superfeedr, a company dedicated to making RSS and Atom feeds realtime. It has implemented PubSubHubbub from day one and now hosts several hubs, including hubs for ReadWriteWeb, Tumblr, Posterous and Gawker. Follow Julien on Twitter.

When you’re a search engine, you obviously have an exhaustiveness requirement. You can’t really skip indexing the Indian Ocean. Google sends its bo(a)ts all over the ocean, wherever it’s raining, to update its index. However, the ocean is growing so fast that it will become harder and harder to stay exhaustive.

Unfortunately, not only is the ocean growing, but it’s also raining more, which means that if a bo(a)t stays away from a zone for too long, the zone will have changed tremendously by the time it comes back. That’s what happens when you see results in a search engine that are one or two years old, or even older. They’re not wrong; they’re just often no longer accurate, yet they still rank well.

It’s a real technical problem for search engines to know where to send their bo(a)ts, and when! And when Google says they’re going to feed their search index with PubSubHubbub data, that’s what they’re trying to do: save a little bit on the boats.
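For readers who haven’t seen the protocol, here is a rough sketch of what "saving on the boats" looks like in practice: instead of re-crawling a feed on a schedule, a subscriber asks the feed’s hub once to push new entries to a callback URL. The snippet only builds the form-encoded body of such a subscription request, using the `hub.mode`, `hub.topic` and `hub.callback` parameters from the PubSubHubbub spec (the 0.3 draft also expects a `hub.verify` parameter, left out here for brevity). The URLs and the `build_subscribe_body` helper are made up for illustration; this is not Google’s or Superfeedr’s actual code.

```python
from urllib.parse import urlencode, parse_qs


def build_subscribe_body(topic_url: str, callback_url: str) -> str:
    """Form-encode the body of a PubSubHubbub subscription request.

    topic_url    -- the feed we want updates for (hypothetical example URL)
    callback_url -- where the hub should push new entries (hypothetical)
    """
    params = {
        "hub.mode": "subscribe",       # ask the hub to start a subscription
        "hub.topic": topic_url,        # the feed being subscribed to
        "hub.callback": callback_url,  # endpoint the hub will POST updates to
    }
    return urlencode(params)


body = build_subscribe_body(
    "https://example.com/feed.atom",
    "https://subscriber.example.net/push",
)
print(body)
```

The body would then be sent as an HTTP POST to the hub advertised by the feed. After that, no boat needs to sail: the hub delivers each new drop to the callback as it falls.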

I strongly disagree with John Battelle when he says this is not a huge deal. My take is that he sees this only as a great technical and infrastructure opportunity for Google, not so much as an immediate benefit for the end user. I strongly disagree, and so do you. You disagreed when you typed “earthquake” into Twitter Search, or even “hudson crash”, or “Michael Jackson”. At that point, you knew that Google wasn’t able to provide you with the information you were looking for, and this is a massive loss for Google.

Google will have a hard time getting this mind share back. The first thing it needs to do is to actually have results that date from the very minute people search for these things.

You may argue that if you search 10 times a day on Google, you go maybe once a week to Twitter search. I’m the same, no worries. Yet, I know that Twitter is much better than Google at contextualization. When I do a search on Google, I expect to find the absolute truth. If I look for “earthquake”, I’m looking for facts about earthquakes: pictures or maybe historical data. If I look for “earthquake” on Twitter, I’m looking for context; I want what is being said about earthquakes now (and here!).

As a matter of fact, Google has always had a lot of issues with context because they know so little about the people who search there (or maybe they know a lot, but don’t want to scare us). Adding PubSubHubbub is a way for them to take the “time dimension” back. They may never have the conversations that Twitter has, but they will have a much bigger ocean of data than Twitter’s sea of Tweets.

Photo by Pam Roth.
