scRUBYt!
WWW::Mechanize and Hpricot on Steroids

Briefly...
scRUBYt! is a simple to learn and use, yet powerful web scraping toolkit written in Ruby. The idea behind making scRUBYt! was to show a few simple concepts of Web extraction as a practical extension of this tutorial.
February 6th, 2007 at 3:03 pm
Hmm, upon running this I receive a rather odd error:
[ACTION] fetched http://scrubyt.org/wp-comments-post.php
NoMethodError: undefined method `parent' for #<Hpricot::Doc "Error: This file cannot be used on its own.">
from f:/ruby/lib/ruby/gems/1.8/gems/scrubyt-0.2.0/lib/scrubyt/extractor.rb:30:in `define'
from (irb):3
February 6th, 2007 at 3:08 pm
Well, the problem is that this extractor only runs with Zaheed’s patch, which I have not had time to apply yet. If you are looking for examples, please find them here:
http://rubyforge.org/frs/download.php/17170/scrubyt-examples-0.2.0.zip
Btw, some of the examples there (e.g. digg, ebay) need to be updated, because the example data used in them is no longer valid! More on this soon…
February 9th, 2007 at 9:11 am
Is there a way to use XPath to find the node we want instead of putting in some text as an example? Text changes more often than nodes and HTML structure.
February 9th, 2007 at 9:20 am
andre,
First of all, the examples are there just to learn the rules of extraction - after this they should be discarded!
Suppose you have an extractor named andres_extractor. Then simply do this:
andres_extractor.export(FILE)
Check the directory from which you launched the script, and voilà! There should be a file named andresextractorexported.rb. That should be it.
I am just writing the next tutorial where I will show how to use different types of examples - XPath is also one of the possibilities. You can simply write this:
price '/table/tr[1]/td[1]'
instead of e.g.
price '$67.34'
Check also the extracted file to see XPath examples in action. You can also check out the next tutorial for more explanation.
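To illustrate why an XPath can be more robust than a text example, here is a minimal sketch using Ruby's REXML library rather than scRUBYt! itself. The two page snippets and the helper names (text_at, cell_with_text) are hypothetical and only stand in for a scraped page whose price changes between visits.

```ruby
require 'rexml/document'

# Hypothetical markup standing in for the same page on two different days.
PAGE_V1 = '<table><tr><td>$67.34</td><td>Widget</td></tr></table>'
PAGE_V2 = '<table><tr><td>$59.99</td><td>Widget</td></tr></table>'

# Structural lookup: address the cell by its position in the tree.
def text_at(xml, xpath)
  node = REXML::XPath.first(REXML::Document.new(xml), xpath)
  node && node.text
end

# Text-based lookup: find the first cell whose text equals the example.
def cell_with_text(xml, example)
  cells = REXML::XPath.match(REXML::Document.new(xml), '//td')
  cell = cells.find { |td| td.text == example }
  cell && cell.text
end

# The XPath keeps pointing at the price cell on both versions of the page.
puts text_at(PAGE_V1, '/table/tr[1]/td[1]')
puts text_at(PAGE_V2, '/table/tr[1]/td[1]')

# The old text example no longer matches anything once the price changes.
p cell_with_text(PAGE_V2, '$67.34')
```

The XPath survives the price change because it depends only on the table layout, while the literal text example has to be updated whenever the content changes.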
I suggest subscribing to the feed so you get notified when there are new tutorials, examples, or anything else.
February 9th, 2007 at 9:21 am
sorry, it should be
andres_extractor.export(__FILE__)
markdown screwed that up…
April 8th, 2008 at 8:26 am
Posting a comment using my first scRUBYt! script. Let's see if it works; I will publish the code snippet here.