The Glory, Bliss and How-to of Screen Scraping for RSS

Wired has an awesome top story today on the world of startups utilizing scraped data from big companies to offer new layers of value for their own users. It’s a roughly objective piece that I highly recommend reading, but it was also the inspiration for me to finally record a screencast on the subject (see below).

I love RSS, probably more than anything on the web. If you’re not familiar with the concept, see my very old definition of RSS and my almost-as-old post on teaching people about RSS.

Not every page on the web publishes an RSS feed, though. Thus the need for these wonderful screen scraping tools. I’ve written about a variety of tools you can use to create a feed for a site or page that doesn’t have one. Sometimes, though, you’ve got to pull out the big guns. In those cases, it’s time for Dapper.

Dapper is a company founded in Israel that is now venture backed and was named in the aforementioned Wired article. It is the sweetness.

Dapper will let you pull data from almost any web page and get it in a wide variety of outputs, including RSS, email, iCal, a Google Gadget, CSV and Google Maps. Is that incredible or what?
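For a sense of what a tool like Dapper saves you from doing by hand, here is a minimal Python sketch of the basic idea: pull the links out of a page and wrap them in an RSS 2.0 feed. This is not how Dapper works internally. The `LinkCollector` class, the `html_to_rss` function and the sample HTML are all made up for illustration, and a real page would need far more careful selection of which elements count as feed items.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect (href, text) pairs for each <a href> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently being read, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.links.append((self._href, title))
            self._href = None


def html_to_rss(html, feed_title, feed_link):
    """Turn every link found in an HTML string into an RSS 2.0 item."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Scraped feed for " + feed_title
    for href, title in parser.links:
        item = ET.SubElement(channel, "item")
        # Fall back to the URL when the anchor has no visible text.
        ET.SubElement(item, "title").text = title or href
        ET.SubElement(item, "link").text = href
    return ET.tostring(rss, encoding="unicode")


# Assumed sample markup standing in for a page that has no feed of its own.
SAMPLE_HTML = """
<ul>
  <li><a href="http://example.com/a">First post</a></li>
  <li><a href="http://example.com/b">Second post</a></li>
</ul>
"""

feed_xml = html_to_rss(SAMPLE_HTML, "Example bookmarks", "http://example.com/")
print(feed_xml)
```

Even this toy version shows why the big guns are nice to have: every site needs its own rules for what counts as an item, and a hand-rolled scraper like this breaks as soon as the page layout changes.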

Let’s let the video do the talking. I have an awful cold (it’s almost better, Mom!) so please excuse the very rough voice. I made the following screencast using JingProject, setting up an RSS feed of search results in Del.icio.us for articles tagged from ReadWriteWeb.

Clicking on the image below will open up another window so you can view the 4 minute video full screen.

If you’re as excited about Dapper as I am, you should check out DapperCamp, a free two-day conference all about Dapper coming up in early February in San Francisco. IBM and Mindtouch are sponsoring the event and Mitch Kapor is keynoting it. It looks like it’s going to be a lot of fun.

Take that, Wired Mag ambivalence! Really, though, you should read that Wired article – it’s a good one that discusses some issues that are going to be very big once more people figure out how exciting data portability is.
