<?xml version="1.0" encoding="utf-8"?>
<!-- generator="FeedCreator 1.7.2-ppt DokuWiki" -->
<?xml-stylesheet href="http://www.peterkrantz.com/simplecrawler/wiki/lib/exe/css.php?s=feed" type="text/css"?>
<rdf:RDF
    xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel rdf:about="http://www.peterkrantz.com/simplecrawler/wiki/feed.php">
        <title>Simple Crawler</title>
        <description></description>
        <link>http://www.peterkrantz.com/simplecrawler/wiki/</link>
        <image rdf:resource="http://www.peterkrantz.com/simplecrawler/wiki/lib/images/favicon.ico" />
        <dc:date>2010-03-14T02:45:01+00:00</dc:date>
        <items>
            <rdf:Seq>
                <rdf:li rdf:resource="http://www.peterkrantz.com/simplecrawler/wiki/examples/find-broken-links"/>
                <rdf:li rdf:resource="http://www.peterkrantz.com/simplecrawler/wiki/start"/>
                <rdf:li rdf:resource="http://www.peterkrantz.com/simplecrawler/wiki/principles"/>
                <rdf:li rdf:resource="http://www.peterkrantz.com/simplecrawler/wiki/examples/accessibility-report"/>
            </rdf:Seq>
        </items>
    </channel>
    <image rdf:about="http://www.peterkrantz.com/simplecrawler/wiki/lib/images/favicon.ico">
        <title>Simple Crawler</title>
        <link>http://www.peterkrantz.com/simplecrawler/wiki/</link>
        <url>http://www.peterkrantz.com/simplecrawler/wiki/lib/images/favicon.ico</url>
    </image>
    <item rdf:about="http://www.peterkrantz.com/simplecrawler/wiki/examples/find-broken-links">
        <dc:format>text/html</dc:format>
        <dc:date>2009-05-04T21:40:44+00:00</dc:date>
        <dc:creator>Peter Krantz</dc:creator>
        <title>examples:find-broken-links</title>
        <link>http://www.peterkrantz.com/simplecrawler/wiki/examples/find-broken-links</link>
        <description>Find broken links

 This example shows how SimpleCrawler can be used to find broken links on a website (links with HTTP status 404). The site given as the command-line argument is crawled.


require 'simplecrawler'

# Mute log messages
module SimpleCrawler
   class Crawler
      def log(message)
      end
   end
end

# Set up a new crawler
sc = SimpleCrawler::Crawler.new(ARGV[0])

# Crawl first 100 links
sc.maxcount = 100

sc.crawl { |document|
   if document.http_status[0] != &quot;200&quot;…</description>
    </item>
    <item rdf:about="http://www.peterkrantz.com/simplecrawler/wiki/start">
        <dc:format>text/html</dc:format>
        <dc:date>2009-05-04T21:03:53+00:00</dc:date>
        <dc:creator>Peter Krantz</dc:creator>
        <title>start</title>
        <link>http://www.peterkrantz.com/simplecrawler/wiki/start</link>
        <description>With SimpleCrawler (SC), basic web crawling becomes easy in Ruby. Use SimpleCrawler as the foundation for your own crawling needs.

SC is inspired by code in an article by Scott Nedderman (which didn't work properly for me).

A minimal example (crawl a website and print page titles):</description>
    </item>
    <item rdf:about="http://www.peterkrantz.com/simplecrawler/wiki/principles">
        <dc:format>text/html</dc:format>
        <dc:date>2008-12-09T13:09:41+00:00</dc:date>
        <dc:creator>Peter Krantz</dc:creator>
        <title>principles</title>
        <link>http://www.peterkrantz.com/simplecrawler/wiki/principles</link>
        <description>Basic Principles of using SimpleCrawler

 Using SC involves the following steps:

1. Require the simplecrawler library


require 'simplecrawler'


2. Create an instance of the SimpleCrawler::Crawler class

Pass the website address as a parameter if you like.</description>
    </item>
    <item rdf:about="http://www.peterkrantz.com/simplecrawler/wiki/examples/accessibility-report">
        <dc:format>text/html</dc:format>
        <dc:date>2008-01-26T13:37:55+00:00</dc:date>
        <dc:creator>Peter Krantz</dc:creator>
        <title>examples:accessibility-report</title>
        <link>http://www.peterkrantz.com/simplecrawler/wiki/examples/accessibility-report</link>
        <description>Site Accessibility Report with Raakt and Ruport

In this example, SimpleCrawler is used to crawl a website, test each page with the Ruby Accessibility Analysis Kit (Raakt), and format the result with Ruport. Note that you need a working installation of the Ruby programming language and the RubyGems package manager to use this code.</description>
    </item>
</rdf:RDF>
