Mar 13
I used to cringe at having to work with XML. These days there are nice ways to work with it, from E4X to Groovy builders and, of course, Hpricot.
I wanted to take my OPML file and grep out the URLs so I could create a custom search engine that searches only my buddies' sites (the ones listed in the OPML file).
It is basically a one-liner with Hpricot:
    require 'rubygems'
    require 'hpricot'

    filename = ARGV.first || 'mysubscriptions.opml'
    doc = open(filename) { |f| Hpricot(f) }

    (doc/"outline[@htmlurl]").each do |url|
      puts url.attributes['htmlurl']
    end
In my case the OPML file is just sitting on disk there, but I could easily have it grab the file from a URL:
    require 'open-uri'

    doc = Hpricot(open("http://almaer.com/mysubs.opml"))
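If you don't have Hpricot handy, here is a rough sketch of the same extraction using REXML, which ships with Ruby. To be clear, this swaps in a different library and is not what I ran above. The `html_urls` helper and the sample OPML are made up for illustration. REXML is a case-sensitive XML parser, so the sketch matches the attribute as OPML files usually write it (`htmlUrl`), whereas the Hpricot snippet uses lowercase `htmlurl`.

```ruby
require 'rexml/document'

# Hypothetical helper (not from the original post): returns the htmlUrl
# attribute of every <outline> element that has one, in document order.
def html_urls(opml_source)
  doc = REXML::Document.new(opml_source)
  REXML::XPath.match(doc, '//outline[@htmlUrl]').map do |outline|
    outline.attributes['htmlUrl']
  end
end

# Made-up sample data standing in for a real subscriptions file.
SAMPLE_OPML = <<~XML
  <opml version="1.1">
    <body>
      <outline text="Example One" xmlUrl="http://one.example.com/feed" htmlUrl="http://one.example.com/"/>
      <outline text="A folder with no htmlUrl">
        <outline text="Example Two" xmlUrl="http://two.example.com/feed" htmlUrl="http://two.example.com/"/>
      </outline>
    </body>
  </opml>
XML

# Print one site URL per line, like the Hpricot version does.
puts html_urls(SAMPLE_OPML)
```

The folder outline is skipped because it has no `htmlUrl`, which is the same job the `[@htmlurl]` filter does in the Hpricot selector.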
Not bad until we implement JsDOM ;)
Oh, and here is my nice custom search engine:
September 7th, 2009 at 4:46 pm
Great article. Here’s a related one that uses hpricot and open-uri to auto-populate title and description information for links:
http://doblock.com/articles/using-hpricot-to-auto-populate-link-information