H2G2 RSS feed

Inspired by a combination of Danny O'Brien's Life Hacks talk and Yoz and Sean's Hitchhiker's Guide to the Galaxy talk (both at NotCon), I have knocked up a quick web-scraping RSS feed for the official H2G2 movie blog.
Here's what you need...
First of all, I'm assuming your OS has curl (mine's OS X).
  • Step one, get the HTML and XMLify it
    curl http://hitchhikers.movies.go.com/hitchblog/blog.htm | tidy -asxml --indent yes --doctype strict --output-encoding latin1 --force-output yes > h2g2_blog.html
  • Step two, transform the XML to RSS
    xsltproc --novalid -o h2g2_blog.xml h2g2_blog.xslt h2g2_blog.html
    The xslt can be found here.
  • Step three, point your aggregator at your RSS feed. Done.
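
For anyone who wants to run steps one and two in one go, here is a rough sketch of them wrapped in a shell function. The function name build_feed, the BLOG_URL variable and the temporary file are my own additions for illustration. The curl, tidy and xsltproc options are the ones from the steps above, plus curl's -f, -s, -S and -o flags. It assumes h2g2_blog.xslt is in the current directory.

```shell
#!/bin/sh
# Sketch only: build_feed and BLOG_URL are illustrative names, not part of
# the original recipe.
BLOG_URL="http://hitchhikers.movies.go.com/hitchblog/blog.htm"

build_feed() {
    raw=$(mktemp) || return 1

    # Step one (a): fetch the HTML. -f fails on HTTP errors, -sS hides the
    # progress meter but still reports errors, -o writes to the temp file.
    if ! curl -fsS -o "$raw" "$BLOG_URL"; then
        rm -f "$raw"
        return 1
    fi

    # Step one (b): XMLify it. tidy exits non-zero on warnings and errors
    # even when --force-output writes a file, so its status is ignored and
    # the output file is checked instead.
    tidy -asxml --indent yes --doctype strict \
         --output-encoding latin1 --force-output yes \
         "$raw" > h2g2_blog.html || true
    rm -f "$raw"
    [ -s h2g2_blog.html ] || return 1

    # Step two: transform the XML to RSS.
    xsltproc --novalid -o h2g2_blog.xml h2g2_blog.xslt h2g2_blog.html
}
```

Source the file and call build_feed from the directory holding the XSLT. If it returns zero, h2g2_blog.xml should be ready for your aggregator.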

This method obviously requires curl, tidy and xsltproc to be installed on your OS. The usual disclaimers apply: the XSLT is pretty fragile and, like any screen scrape, will probably break if the blog's markup changes.
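
If you're not sure whether the three tools are installed, a quick check along these lines might help. It is a minimal sketch: missing_tools is a made-up helper name, and it relies only on the standard `command -v` shell builtin.

```shell
#!/bin/sh
# missing_tools is a hypothetical helper: it prints the name of each
# listed command that cannot be found on the PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Report anything the recipe needs that isn't installed.
missing=$(missing_tools curl tidy xsltproc)
if [ -n "$missing" ]; then
    echo "Install these before running the steps above:" >&2
    echo "$missing" >&2
fi
```

If nothing is printed, all three are available.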


About this Entry

This page contains a single entry by James published on June 8, 2004 9:26 PM.
