Changing the world one screen scraper at a time

Several of us from Rubyred attended Mashup Camp last week (in fact, we were the official grapefruit sponsor of the event), and we were fortunate to meet some of the real pioneers of mashdom, all showing off what can be done with an API or two, some regular expressions and a bit of vision: people like Paul Rademacher from Housingmaps, Adrian Holovaty of Chicagocrime.org, Taylor McKnight of PodBop, and Bartosz Solowiej and Frank Harris from Traincheck.

The most exciting thing for me about Mashup Camp was seeing clearly the contours of an emergent phenomenon now in its earliest stages. We have APIs for only a minuscule portion of the data providers out there, and this is unlikely to change anytime soon. But we are starting to see a new breed of home-brewed APIs built on top of the screen scrapers we’ve all been writing and maintaining for years: scrapers that pull crime stats from police blotters, address data from Craigslist apartment listings, MP3s from web sites.

It’s one thing to build a scraper for your own app; it’s another to provide open access to it, enabling other developers to call its methods as if it were an open API from the data provider itself. For instance, Ontok is providing just such an interface for grabbing data from Wikipedia.
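To make the distinction concrete, here is a minimal sketch in Python of a scraper wrapped in a tiny HTTP endpoint. Everything in it is an assumption made up for illustration: the sample HTML, the `/listings` path, and the names `scrape_listings`, `listings_as_json` and `ScrapiHandler` do not come from any real site or from any of the projects mentioned above. A real scrAPI would fetch the live page rather than a canned string, and would need to cope with markup changes.

```python
import json
import re
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical markup, invented for this example (not any real site's HTML).
SAMPLE_HTML = """
<ul>
  <li class="listing"><a href="/apt/101">Sunny 1BR</a> <span class="addr">123 Main St</span></li>
  <li class="listing"><a href="/apt/102">Loft near park</a> <span class="addr">456 Oak Ave</span></li>
</ul>
"""

# One regular expression per record; named groups become the record's fields.
LISTING_RE = re.compile(
    r'<li class="listing"><a href="(?P<url>[^"]+)">(?P<title>[^<]+)</a>\s*'
    r'<span class="addr">(?P<address>[^<]+)</span></li>'
)


def scrape_listings(html):
    """The scraper half: turn raw HTML into a list of dicts."""
    return [match.groupdict() for match in LISTING_RE.finditer(html)]


def listings_as_json(html):
    """The API half: serialize the scraped records as a JSON document."""
    return json.dumps({"listings": scrape_listings(html)})


class ScrapiHandler(BaseHTTPRequestHandler):
    """Serves the scraped data at the (assumed) path /listings."""

    def do_GET(self):
        if self.path != "/listings":
            self.send_error(404)
            return
        # A real service would download the source page here; this sketch
        # scrapes the canned sample so that it runs without network access.
        body = listings_as_json(SAMPLE_HTML).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Other developers could now request http://localhost:8000/listings
    HTTPServer(("localhost", 8000), ScrapiHandler).serve_forever()
```

The point of the split is that `scrape_listings` is the part most of us have already written for our own apps; the handler is the small extra step that lets someone else call it without knowing anything about the underlying HTML.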

I proposed at the conference that we call this new incarnation “scrAPIs”, not realizing that the term had been coined back in 2002 by Paul Bausch. Paul saw the need for this before Amazon opened up its API, and is one of the godfathers of the modern mashup.

The concept of the scrAPI is potentially huge. Rather than waiting years for data providers to build their own APIs, we can build them today by leveraging and sharing the work we’ve already done on scrapers. Given the intellectual property issues, there are some tricks to doing this on the right side of the law. Consolidating scrAPI efforts as collaborative projects has huge benefits (it’s open source!), but how can we structure this particular kind of effort? I have some ideas about how to get things rolling, and I know others do as well. For starters, I’ll be posting a piece every day this week, each exploring an aspect of the scrAPI.

Coming up next: What is a scrAPI, and why do we need them?

Rubyred at Mashup Camp
Rubyred brings the grapefruit

2 Responses to “Changing the world one screen scraper at a time”

  1. toszter Says:

    Way to go Thor! I look forward to learning and contributing!!

    B

  2. co.mments.com » Blog Archive » Scrapi Says:

    [...]’s all a matter of adding more features to the existing scrAPI. And more on scrAPIs, from Thor Muller of [...]
