Wednesday, April 21, 2010

Quick hacks: webnumbr for fast scraping (of a single number)

Like the previous quick hack, I had little to do with this -- I'm just demonstrating a tool. In this case, the goal was to graph the national average for certificate of deposit interest rates over time (you can find this information at Bankrate.com). At SHDH37, Drew told me about webnumbr, which does exactly this (and was possibly developed at a previous SHDH): you pick a webpage and then select a single element from that page. webnumbr will scrape the webpage at an interval of your choosing and graph the data. I'm not wild about their graphing (I'd prefer something interactive like this or Google Charts), but it looks like you can get the raw data in various formats (CSV, etc.), which would let you use other graphing methods, overlay charts, etc.

If you're interested, here are graphs for the average 1-year and 2-year CD rates.
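
For anyone curious what "scrape a single number and log it" boils down to, here is a minimal Python sketch. This is not how webnumbr works internally (I don't know its implementation); the sample HTML, the regular expression, and the function names `extract_number` and `append_sample` are all made up for illustration. It pulls one number out of a page's markup and writes a timestamped CSV row.

```python
import csv
import io
import re
from datetime import datetime, timezone

# Hypothetical markup; the structure of the real page is not assumed here.
SAMPLE_HTML = (
    '<table><tr><td class="rate">1-year CD</td>'
    '<td class="value">1.25%</td></tr></table>'
)


def extract_number(html, pattern):
    """Return the first number captured by `pattern` as a float, or None."""
    match = re.search(pattern, html)
    if match is None:
        return None
    # Drop thousands separators before converting.
    return float(match.group(1).replace(",", ""))


def append_sample(out, when, value):
    """Write one (timestamp, value) row to an open text file-like object."""
    csv.writer(out).writerow([when.isoformat(), value])


# Illustrative pattern: capture the digits just before the percent sign.
rate = extract_number(SAMPLE_HTML, r'class="value">([\d.,]+)%')

# An in-memory buffer stands in for a CSV file on disk.
log = io.StringIO()
append_sample(log, datetime(2010, 4, 21, tzinfo=timezone.utc), rate)
```

Run something like this on a schedule against the real page (fetching the HTML is left out on purpose) and you end up with a CSV you can feed to whatever charting tool you like.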

Monday, April 19, 2010

Quick hacks: Yahoo Pipes for RSS feed filtering

In my Copious Free Time(tm), I've been working on some very quick projects. I'm writing them up since, while many of them are quite simple, they highlight some interesting tools.

I used to read the "Marmaduke Explained" blog at this website. It has since moved here, but the RSS feed includes other entries as well. Using Yahoo Pipes, I was quickly able to create a new RSS feed that only includes entries with Marmaduke in the title. See the pipe I made here (output available as RSS, JSON, email, etc.). Yahoo Pipes is, like the name implies, a series of tubes -- nodes in the graph perform operations and feed into other nodes. My point is that Yahoo Pipes is a cool piece of software and probably underutilized -- the basic filtering that I do is just the tip of the proverbial iceberg for what it can do (for example, there are "translate" and "location extractor" nodes). If they added basic scraping abilities, I'd likely use it even more (but see upcoming posts for different ways to do scraping...).
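
For comparison, the same title filter can be sketched in a few lines of Python using only the standard library. This is an illustration of the idea rather than what Yahoo Pipes does under the hood; the sample feed, the function name `filter_feed`, and the choice of case-insensitive matching are my own assumptions.

```python
import xml.etree.ElementTree as ET

# Made-up feed standing in for the real one.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example blog</title>
<item><title>Marmaduke Explained: Monday</title><link>http://example.com/1</link></item>
<item><title>Unrelated post</title><link>http://example.com/2</link></item>
<item><title>More marmaduke</title><link>http://example.com/3</link></item>
</channel></rss>"""


def filter_feed(rss_xml, keyword):
    """Return the RSS document with only items whose title contains `keyword`."""
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    # Copy the list first, since we remove items while looping.
    for item in list(channel.findall("item")):
        title = item.findtext("title", default="")
        # Case-insensitive match (an assumption; a pipe could be configured either way).
        if keyword.lower() not in title.lower():
            channel.remove(item)
    return ET.tostring(root, encoding="unicode")


filtered = filter_feed(SAMPLE_FEED, "Marmaduke")
titles = [t.text for t in ET.fromstring(filtered).findall("channel/item/title")]
```

The appeal of Pipes is that you get this without hosting anything yourself, plus the other output formats for free.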