SimpleCrawler for your everyday web crawling needs

Over at the standards-schmandards blog I often test websites to gather statistics on specific HTML usage, accessibility and other things. Each time, I have written a new web crawler to collect the data. In Python and Ruby this is a simple task, but the last time felt like déjà vu, so I decided to create a Ruby library that I could reuse in the future.

SimpleCrawler is a Ruby gem that covers basic web crawling needs. Coupled with Raakt and Ruport it can be used to create a basic accessibility report for a website.

A minimal example:
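
The sketch below does not use SimpleCrawler's own API (see the wiki for that). It is a plain-Ruby illustration, standard library only, of the kind of crawl loop a library like this saves you from rewriting: fetch a page, pull out its links, and follow the ones that stay on the same site. The names `extract_links` and `crawl`, the `max_pages` limit and the regex-based link extraction are all assumptions made for the example, not anything taken from the gem.

```ruby
require 'set'
require 'uri'

# Hypothetical helper (not part of SimpleCrawler): pull href values out of
# an HTML string and resolve them against the URL of the page they came from.
# Fragments (#...) are dropped; a regex is a simplification, not a real parser.
def extract_links(html, base_url)
  html.scan(/<a\s[^>]*href\s*=\s*["']([^"'#]+)/i).flatten.filter_map do |href|
    begin
      URI.join(base_url, href.strip).to_s
    rescue URI::Error
      nil # skip hrefs that are not valid URIs
    end
  end
end

# Hypothetical breadth-first crawl. `fetch` is any callable that takes a URL
# string and returns the page body (or nil on failure), so the loop can be
# tried without network access. Yields each visited URL and its body.
def crawl(start_url, fetch, max_pages: 50)
  host = URI(start_url).host
  queue = [start_url]
  seen = Set.new(queue)
  visited = []

  until queue.empty? || visited.size >= max_pages
    url = queue.shift
    body = fetch.call(url)
    next if body.nil?

    visited << url
    yield url, body if block_given?

    extract_links(body, url).each do |link|
      next unless URI(link).host == host # stay on the starting site
      queue << link if seen.add?(link)   # add? returns nil for known links
    end
  end
  visited
end

# Usage with an in-memory "site" standing in for real HTTP requests.
pages = {
  "http://example.com/" =>
    '<a href="/about">About</a> <a href="http://other.example.org/">Other</a>',
  "http://example.com/about" => '<a href="/">Home</a>'
}
crawl("http://example.com/", ->(url) { pages[url] }) { |url, _body| puts url }
```

For a real run, the fetcher could be something like `->(url) { Net::HTTP.get(URI(url)) }` after `require 'net/http'`. Passing the fetcher in as a callable keeps the crawl logic separate from the network code, which also makes it easy to plug in a per-page check such as an accessibility test.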

For more details and examples see the SimpleCrawler wiki.

Comments

  1. Andrei says at 2007-08-27 22:08:

    The wiki doesn’t seem to be up yet; I get a ‘Permission Denied’ message every time.

  2. Pete says at 2007-08-27 23:08:

    Andrei: Thank you. Forgot to set the wiki permissions. It should be available now.
