Archive for December, 2008

At last: scRUBYt! 0.4.1 is out

After more than a year, I’d like to announce a new release of scRUBYt! and set “scRUBYt!”.is_vaporware? = false. w00t!

Thanks to Glenn Gillen, it is now possible to use FireWatir as the agent for navigation, which enables scraping of AJAX pages and more robust scraping in general via Firefox/FireWatir.

Another big piece of news is that the RubyInline, ParseTree and Ruby2Ruby dependencies were dropped, since we couldn’t solve the problems with them on win32 for a whole year. Yay for the Windows users (and users of other OSes juggling various versions of the above stuff).

Of course a lot of bugs were fixed as well.

On the non-source code front, we have

and probably other cool stuff which I can’t remember right now! Will update the article later.

What’s next?

The biggest news is that scRUBYt! is going to be rewritten from scratch - the work has already been started by Glenn Gillen. scRUBYt! has grown too big for our taste, so we decided to start anew, aiming for 100% RSpec coverage, refactored code, speed/performance optimization and leaving all the cruft behind. So scRUBYt! 0.4.1, the last release based on the original scRUBYt!, will be supported until the new, rewritten one (0.5.0) comes out and takes its place.

TextMate Bundle for scRUBYt!

As stupid as this sounds coming from the original author after countless hours of scRUBYt! usage and development, I still had to occasionally open some older scrapers to get the exact syntax of logger, exporter, click_link_and_wait etc. Even though I know 95% of the possible commands, I thought it’d be great to cut down on typing time - a typical scRUBYt! extractor has tons of boilerplate code.

So I decided to create a TextMate bundle and host it on GitHub. It’s pretty rudimentary right now, consisting of about two dozen snippets, but hey, it’s a start.

I bet scgoog->TAB will become a big favorite right away (it spits the classic Google extractor example into your editor) - but there are other useful snippets included as well. With their help it’s literally possible to create a scraper in a few seconds.

If you have further ideas, would like to contribute, etc., please drop me a mail (scrubyt -nice try spambot! NOT.- at scrubyt dot org).