All posts in Ruby

Rails vs Grails vs Django models

Coming back to Rails after being away for some time in Django land, I discovered a huge difference in how Rails, Grails and Django treat your models. Read more…

Intricacies of PHP compared to Ruby

Via Tim Bray’s blog I found zestyping’s “Why PHP should never be taught”. In it he provides some interesting PHP code that will be difficult for beginners to understand. Read more…

SimpleCrawler for your everyday web crawling needs

Over at the standards-schmandards blog I often test websites to gather statistics on specific HTML use, accessibility and other things. Each time I have written a web crawler to collect the data. In Python and Ruby this is a simple task, but last time it felt like déjà vu and I decided to create a Ruby library that I could use in the future. Read more…

Hackety Hack – The Foundation for a Revolution

Why the lucky stiff is a well-known name among most Ruby developers. Many have read his Ruby programming tutorials and seen his spectacular performances (or whatever they are) at RailsConf and elsewhere. Personally, I owe him a lot for Hpricot, the liberal HTML parser (at my government agency we use it to run the quarterly test of all public websites in Sweden). Hpricot is also the default parser for the Ruby Accessibility Analysis Kit. Read more…

Checking for Model Classes Before Using Them in Rails Migrations

If you are using model objects in migrations (e.g. for inserting data) you should make sure that the migration works even if that model class is removed. I discovered this when setting up a new development environment and running all migrations in an empty database.

Let’s say you have the following migration code:

class InsertCounties < ActiveRecord::Migration
  def self.up
    County.create :code => '10', :name => 'Blekinge'
    County.create :code => '20', :name => 'Dalarna'
  end
end

This assumes that the County model is available when the migration is run. If you checked out the most recent version of your code from svn it is possible that it doesn’t contain the County model and the migration will fail. To check if County is available before trying to use it we can use Ruby’s defined? like this:

Update: No we can’t. defined? always returns false inside an ActiveRecord migration for some reason (maybe the class isn’t loaded before the actual call?). We have to use a begin…rescue…end block instead:

class InsertCounties < ActiveRecord::Migration
  def self.up
    begin
      County.create :code => '10', :name => 'Blekinge'
      County.create :code => '20', :name => 'Dalarna'
    rescue
      puts "Could not add data…"
    end
  end
end

This way the rest of the migrations will run.
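
The same idea can be sketched outside of Rails. The helper name seed_if_available and the stand-in classes below are assumptions made for illustration only. The sketch rescues NameError specifically, which is what Ruby raises for a constant that cannot be resolved, instead of a bare rescue:

```ruby
# Hypothetical helper (not part of Rails): runs the block and reports
# whether it completed. A missing constant, such as a model class that
# is not in the checkout, raises NameError, which is rescued here so
# the caller can carry on.
def seed_if_available(label)
  yield
  true
rescue NameError => e
  puts "Could not add #{label} data: #{e.message}"
  false
end

# Stand-in for a model class that exists.
County = Struct.new(:code, :name)

# The block runs normally because County is defined.
seed_if_available("county") { County.new('10', 'Blekinge') }

# Province is not defined anywhere, so the block is skipped with a message.
seed_if_available("province") { Province.new('10', 'Blekinge') }
```

One caveat: NoMethodError is a subclass of NameError, so this also hides a misspelled method name inside the block.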

Bringing Ruby to the .NET environment

Things are heating up in the Ruby-as-a-.NET-language area. Martin Fowler voiced his concerns about Microsoft not being able to look at source code and therefore having trouble implementing Ruby properly. Microsoft, with John Lam in the cockpit, is implementing Ruby for the .NET platform (if you have been reading my previous blog posts, I predicted way back in February 2006 that John Lam would get scooped up by Microsoft :-).

Ola Bini is also concerned about Microsoft not letting their developers look at the Ruby implementation. If you remember the whole SCO debacle I guess it isn’t that strange. Microsoft is in a position where software they develop may end up on millions of computers. Apparently the US legal system awards damages in proportion to this. Thus, any issues with a Ruby implementation on .NET can quickly become costly.

It is all quite bizarre. Does this mean that the Microsoft version of the Ruby language is different from the “original” Ruby? I guess we will never know. Developers will probably write a lot of Ruby code that runs happily on the CLR. Rails applications will be deployed. But I am sure that there will be “special cases” where IronRuby will differ from “original” Ruby.

Therefore it was refreshing to see that Queensland University of Technology is progressing steadily with its Ruby.NET implementation. Currently you can actually compile a Ruby script into a .NET 2.0 assembly that other CLR languages can talk to. This may be the spearhead into the other half of enterprise deployment options.

All in all the future of software development looks bright. Will developers that invested a lot of time in Java or C# switch? Or will they move on to maintaining applications?

Working with UTF-8 in PDF::Writer and Ruby on Rails

Googling for information on how to use PDF::Writer shows that there are many European developers frustrated with the lack of UTF-8 support in PDF::Writer. As Ruby on Rails works great with UTF-8 these days, this can be a bit of an issue.

Part of the problem lies in the fact that the PDF specification (at least up to 1.6) does not support UTF-8 (you can use UTF-16 if you like). I had the misfortune of plowing through it a couple of years ago when developing a PDF form filler library for a customer (don’t ask).

In Ruby on Rails, this is easy to solve as long as you only use Latin characters with diacritics. The solution is to switch encoding back to ISO-8859-15 for text strings you feed to PDF::Writer.

A simple extension to the String class will do the trick:

class String
  require 'iconv'
  def to_iso
    c = Iconv.new('ISO-8859-15', 'UTF-8')
    c.iconv(self)
  end
end

If you are working in Rails you can put this code in the lib folder (I usually call the file string_extensions.rb).

Then, when you call the text method on your PDF::Writer instance you can easily pass a correctly encoded string.
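
On Ruby 1.9 and later, where Iconv is no longer bundled, the same conversion can be sketched with the built-in String#encode. The helper name to_iso_8859_15 is an assumption for illustration, and replacing unmappable characters with a question mark is just one possible choice:

```ruby
# Assumed helper name: converts UTF-8 text to ISO-8859-15 bytes.
# Characters with no ISO-8859-15 equivalent are replaced with "?".
def to_iso_8859_15(text)
  text.encode('ISO-8859-15', 'UTF-8',
              :invalid => :replace, :undef => :replace, :replace => '?')
end

latin = to_iso_8859_15("Sm\u00F6rg\u00E5sbord")
puts latin.encoding   # the string is now tagged as ISO-8859-15
puts latin.bytesize   # one byte per character after conversion
```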

Overriding PDF::Writer text method

A much cleaner approach, as Aníbal describes in the comment below, is to override PDF::Writer’s text method.

Put the following code in a file called pdfwriter_extensions.rb (or whatever you choose to call it) in your lib directory:

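require 'iconv'

CONVERTER = Iconv.new('ISO-8859-15//IGNORE//TRANSLIT', 'utf-8')

module PDF
	class Writer
		# keep the original method available under a new name
		alias_method :old_text, :text

		# convert the text to ISO-8859-15 before handing it to the original method
		def text(textto, options = {})
			old_text(CONVERTER.iconv(textto), options)
		end
	end
end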

In your controller that handles the PDF output you add:

  require 'pdf/writer'
  require 'pdfwriter_extensions'

…after which you can use PDF::Writer like in the tutorial:

    pdf = PDF::Writer.new
    pdf.select_font "Helvetica", :encoding => nil
    pdf.text "User name: <b>#{}</b>", :font_size => 16, :justification => :left
    send_data pdf.render, :disposition => 'inline', :filename => "user_details.pdf", :type => "application/pdf"
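
The alias-and-wrap technique used to override the text method can be tried in isolation. In this sketch the Label class is a hypothetical stand-in that only mimics the shape of PDF::Writer's text method, and String#encode stands in for the Iconv converter:

```ruby
# Stand-in for a third-party class. Label and its text method are
# hypothetical and only mimic the shape of PDF::Writer#text.
class Label
  attr_reader :lines

  def initialize
    @lines = []
  end

  def text(string, options = {})
    @lines << string
  end
end

# Reopen the class, keep the original method under a new name,
# and convert the argument before delegating to it.
class Label
  alias_method :old_text, :text

  def text(string, options = {})
    old_text(string.encode('ISO-8859-15', 'UTF-8', :undef => :replace), options)
  end
end

label = Label.new
label.text("Dalarna")
puts label.lines.first.encoding   # the stored string is ISO-8859-15
```

Callers keep using text as before, so the conversion happens in one place instead of at every call site.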

New release of the Ruby Accessibility Analysis Kit and online interface

The current version has some minor bug fixes that will speed up testing. The online test interface has been updated to support direct input of markup. This is for those of you unable to install Raakt locally.

This means that there is no reason to skip basic accessibility testing of whatever you are developing! To find out more on how you can integrate Raakt in your testing framework check out the Raakt wiki which now has a lot more information.

Putting Camping in the Camping logo

Having played with the Digital Invisible Ink Toolkit (DIIT) lately, it was interesting to see how big the logo file for the Camping framework would become if the Camping framework itself was embedded in it. The original logo file (stolen from Why’s site) is a 73 KB PNG file. Embedding camping.rb in it creates a 101 KB file, visible to the right.
Read more…

A new version of the Ruby Accessibility Analysis Kit

This is to announce that RAAKT (The Ruby Accessibility Analysis Kit) has been updated. This release includes more accessibility tests and an initial mapping of tests to the Unified Web Evaluation Methodology (UWEM). Also, thanks to Derek Perrault RAAKT now uses Hpricot to parse the HTML document. This solves the problem where the previous parser (RubyfulSoup) declared a class “Tag” that was likely to clash with your local classes in Rails.

To install the new version, simply type gem update raakt if you have a previous version installed, or gem install raakt if you don’t.


Summary of changes from version 0.4 to version 0.5.1.

  • Example of how to use RAAKT in Watir unit tests.
  • Tests for area element alt attribute.
  • UWEM mapped in comments for relevant test methods.
  • Test to check that input fields of type image have an alt attribute with text.
  • Refactoring of some methods for more compact syntax. Patch by Derek Perrault.
  • Added test to verify that fieldsets have legends.
  • Fixed alt_to_text that needed to check element type before attempting to read attribute value.
  • Fixed language attribute check (downcased value). Added iso language code list.
  • Applied patch from Derek Perrault (better use of Hpricot features).
  • Fixed check for lang attribute (now requires a value as well).
  • Test for charset mismatch in http headers and document meta element.
  • Switch to Hpricot. Patch by Derek Perrault.

An article on the value of basic accessibility tests, and how to integrate them in your development process, is in the works. In the meantime, check out the Raakt wiki.

If you are using Watir it is very simple:

require 'watir'
require 'raakt'
require 'test/unit'

class TC_myTest < Test::Unit::TestCase
  attr_accessor :ie

  def setup
    @ie = Watir::IE.start("")
  end

  def test_startPagePassesBasicAccessibilityCheck
    # set up the accessibility test and pass html to raakt
    raakttest = Raakt::Test.new(@ie.document.body.parentelement.outerhtml)

    # run all tests on the current page
    result = raakttest.all

    # make sure raakt didn't return any error messages
    assert(result.length == 0, result)
  end
end

Parsing ASP.NET sites with WWW::Mechanize and Hpricot

Users of Hpricot (which WWW::Mechanize uses as its default HTML parser) may have discovered that the buffer size for attribute values is set to 16384 bytes by default. Typically this isn’t a problem. I mean, who would put 16 KB of data into an HTML attribute? Well, ASP.NET uses a hidden input field to store view state in order to save a few clock cycles on the server side (and spare developers the hassle of coding view state).

Developers tend to forget to turn off view state, resulting in a lot of data that is never used. The guy who made the decision to have this default view state behaviour has probably caused a lot of unnecessary bytes to clog your internet connection (as it is typically included in each request).

If you are using mechanize and/or Hpricot to parse such a site you may have come across this error:

ran out of buffer space on element <input>, starting on line 38. (Hpricot::ParseError)

If you want to try it out, load this sample viewstate file into Hpricot. The buffer space error has been reported in the Hpricot issue tracker.

Fortunately, from version 0.5 of Hpricot it is easy to increase the buffer size before loading data. This is done by setting the buffer_size attribute to a sufficiently large number:

require 'hpricot'
Hpricot.buffer_size = 262144
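
To get a feel for what a sufficiently large number is, the markup can be measured before it is parsed. This is a rough sketch using only a regular expression. The helper name largest_attribute_value is an assumption, the sample markup is made up, and only double-quoted attribute values are considered:

```ruby
# Hypothetical helper: returns the size in bytes of the longest
# double-quoted attribute value found in the markup.
def largest_attribute_value(html)
  html.scan(/=\s*"([^"]*)"/).map { |match| match.first.bytesize }.max || 0
end

# Made-up page with a 20,000 byte view state field.
viewstate = "x" * 20_000
html = %(<input type="hidden" name="__VIEWSTATE" value="#{viewstate}" />)

needed = largest_attribute_value(html)
puts "Largest attribute value: #{needed} bytes"
puts "Exceeds the 16384 byte default" if needed > 16_384
```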

Fixing Mechanize

As mechanize uses Hpricot as the default parser this error will happen when loading many ASP.NET pages. Fortunately, mechanize allows the user to specify a custom parser class through the pluggable_parser attribute. To make mechanize use Hpricot with a larger buffer size:

require 'hpricot'
require 'mechanize'

Hpricot.buffer_size = 262144
agent = WWW::Mechanize.new
agent.pluggable_parser.default = Hpricot

…and we’re back on track mechanizing the world again.

Hpricot – My New Favourite Ruby XML Parser

One thing missing from the default Ruby distribution is a good XML parser. The included REXML is only sufficient for the most basic scenarios, as performance degrades quickly with XML size.

Recently I had a situation where I needed to parse a 700 Kb XML file and extract some values with XPath queries. Doing this in REXML proved to be too slow (around 30 seconds). Since I was on OS X it was a small task to get the Ruby libxml bindings. The speed increase was immense and everything worked smoothly.

As usual, requirements changed and the application needed to be able to run on both Windows and OS X. Unfortunately the Ruby libxml API does not work on Windows. Looking around, I couldn’t find a decent XML parser for Ruby that worked on both platforms, and I didn’t want to code for both REXML and libxml.

Enter Hpricot. Originally written to do HTML scraping, it is actually very capable of working with XML too. And it has binaries for Windows, Linux and OS X.

A quick example shows how easy it is to load and get data from an XML file:

require 'hpricot'
doc = Hpricot(open("lazaridis_msgs.xml"))
(doc/"//message").each do |message|
  e_number = message.attributes["subject"][16..17]
  puts "Evaluation identifier is #{e_number}"
end

Technically, Hpricot isn’t an XML parser. It doesn’t validate the document, which means that malformed XML can slip through. You will have to be careful if your application relies on the well-formedness of the XML data.
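
If well-formedness matters, one option is to run the data through a stricter parser first. The sketch below uses REXML for that check. The helper name well_formed? is an assumption, and the extra parse costs time on large documents, so this is only a starting point:

```ruby
require 'rexml/document'

# Hypothetical helper: true when REXML can parse the string without error.
def well_formed?(xml)
  REXML::Document.new(xml)
  true
rescue REXML::ParseException
  false
end

good = '<messages><message subject="a"/></messages>'
bad  = '<messages><message></messages>'   # the message element is never closed

puts well_formed?(good)
puts well_formed?(bad)
```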

I will be switching the Ruby Accessibility Analysis Kit over to Hpricot soon. It will be a nice speed increase for your Rails unit tests using RAAKT. This will also solve the problem in RubyfulSoup where the author declared a “Tag” class with a bad scope.

So, maybe I am the last person on earth to discover this, but if you need a great library for XML parsing on multiple platforms, check out Hpricot.

Using the Apple remote in Ruby

After playing with iremoted and Ruby’s IO.popen I am convinced that Ruby really works as glue on many levels.

So, using iremoted to capture Apple remote commands in the terminal, it is trivial to use the remote to control a Ruby application. Of course, calling OS commands isn’t limited to Ruby.

Here is a minimal dungeon game which you control with the Apple remote. If you ever manage to find your way out I would be surprised…

IO.popen("/Applications/iremoted") { |f|

  x = y = 0

  puts "You are standing in an open field west of a white house, with a boarded front door…"

  # read one line per remote event until iremoted exits
  while (line = f.gets)
    line.chomp!

    if line == "Plus released"
      y += 1
    end

    if line == "Minus released"
      y -= 1
    end

    if line == "Previous released"
      x -= 1
    end

    if line == "Next released"
      x += 1
    end

    puts "You are at coordinate #{x}, #{y}."
  end
}

Who will be first to create a terminal based media center application using Ruby? With some more glue from RubyOSA (the Ruby/AppleEvent Bridge) it shouldn’t be that hard.
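
The movement logic in the dungeon game can be tried without a remote by feeding it lines from any IO-like object. In this sketch the MOVES table and the walk method are assumptions for illustration, and a StringIO stands in for the pipe that IO.popen would provide:

```ruby
require 'stringio'

# Maps the event lines used in the game above to coordinate changes.
MOVES = {
  "Plus released"     => [0, 1],
  "Minus released"    => [0, -1],
  "Previous released" => [-1, 0],
  "Next released"     => [1, 0]
}

# Hypothetical helper: reads events line by line and returns the
# final [x, y] position. Lines that are not in MOVES are ignored.
def walk(io, x = 0, y = 0)
  io.each_line do |line|
    dx, dy = MOVES.fetch(line.chomp, [0, 0])
    x += dx
    y += dy
  end
  [x, y]
end

events = StringIO.new("Plus released\nNext released\nNext released\nMinus pressed\n")
x, y = walk(events)
puts "You are at coordinate #{x}, #{y}."
```

Because walk only relies on each_line, the same method should work with the object yielded by IO.popen.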

Making open-uri play nice with HTTPS and expired certificates

I was using the open-uri library to download HTML in an accessibility test when I found that it does not work well when the remote site has an expired certificate. In this case open-uri will throw a “certificate expired” exception. This may be ok as a default behaviour, but there is no option to override the check.

Fortunately you can easily change the behaviour by editing the open-uri source. If you are on Windows it is available in C:\ruby\lib\ruby\1.8\open-uri.rb depending on your installation directory of course.

Somewhere around line 232 (in the version distributed in Ruby 1.8.5) you can see the certificate verification mode used:

http.verify_mode = OpenSSL::SSL::VERIFY_PEER

To skip certificate verification you can change VERIFY_PEER to VERIFY_NONE. The complete section should read:

if target.class == URI::HTTPS
  require 'net/https'
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
  store = OpenSSL::X509::Store.new
  store.set_default_paths
  http.cert_store = store
end
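
Later versions of open-uri than the one discussed here accept an :ssl_verify_mode option, so the library source does not need to be edited. A minimal sketch, assuming a Ruby version that provides URI.open (the helper name fetch_ignoring_certificate is made up for this example):

```ruby
require 'open-uri'
require 'openssl'

# Hypothetical helper: downloads a page without verifying the server
# certificate, so an expired certificate no longer raises an error.
def fetch_ignoring_certificate(url)
  URI.open(url, :ssl_verify_mode => OpenSSL::SSL::VERIFY_NONE) { |f| f.read }
end

# Usage (requires network access, so it is left commented out):
# html = fetch_ignoring_certificate("https://example.com/")
```

Skipping verification also removes the protection against impersonated servers, so it is best kept to test tools like the one described above.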

Google Code Search Reveals Anger, Frustration and Hate

Google’s Code search is a great way to spend an evening. Indexing a hefty amount of source code reveals anger, frustration and hate. Some favourites:

Interestingly, searching for “Ruby sucks” does not return any matching documents…