Writing Parsers in Ruby using Treetop
Treetop is one of the most underrated, yet powerful, Ruby libraries out there. If you want to write a parser, it kicks ass. The only problem is that, unless you already enjoy reading about and playing with parsers, it's not always obvious how to get started with them, or with Treetop in particular. Luckily, Aaron Gough, a Toronto-based Ruby developer, comes to our rescue with some great blog posts.
Aaron, who has a passion for messing around with parsers and language implementations, recently released Koi - a pure Ruby implementation of a language parser, compiler, and virtual machine. If you're ready to dive in at the deep end, the code for Koi makes for good reading.
Starting more simply, though, is Aaron's latest blog post: A quick intro to writing a parser with Treetop. In the post, he covers building a "parsing expression grammar" (PEG) for a basic Lisp-like language from start to finish - from installing the gem, through to building up a desired set of results. It's a great walkthrough and unless you're already au fait with parsers, you'll pick something up.
If thinking of "grammars" and Treetop is enough to make your ears itch, though, check out Aaron's sister article: Writing an S-Expression parser in Ruby. On the surface, this sounds like the same thing as the other one, except that this is written in pure Ruby with no Treetop involvement. But while pure Ruby is always nice to see, it's a stark reminder of how much a library like Treetop offers us.
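To give a feel for what the pure Ruby route involves, here is a minimal sketch of a hand-rolled S-expression reader. It is not Aaron's code: the `SexpReader` class, its `TOKEN` regex, and the choice to map atoms to Ruby integers, floats, strings, and symbols are all assumptions made for illustration. It also cuts corners (string escapes are not decoded, and stray quotes are silently skipped), which is exactly the sort of detail a grammar-driven tool like Treetop helps you pin down.

```ruby
# Illustrative sketch only: a tiny recursive-descent S-expression reader.
# SexpReader and TOKEN are hypothetical names, not taken from Aaron's article.
class SexpReader
  # A token is a parenthesis, a double-quoted string, or a run of other characters.
  TOKEN = /\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+/

  # Parses a single S-expression and returns nested Ruby arrays and atoms.
  def parse(source)
    tokens = source.scan(TOKEN)
    result = read(tokens)
    raise ArgumentError, "unexpected trailing input" unless tokens.empty?
    result
  end

  private

  # Consumes tokens for one expression: either a parenthesised list or an atom.
  def read(tokens)
    token = tokens.shift
    raise ArgumentError, "unexpected end of input" if token.nil?

    case token
    when "("
      list = []
      until tokens.first == ")"
        raise ArgumentError, "missing closing parenthesis" if tokens.empty?
        list << read(tokens)
      end
      tokens.shift # discard the closing ")"
      list
    when ")"
      raise ArgumentError, "unexpected ')'"
    else
      atom(token)
    end
  end

  # Converts a raw token into an Integer, Float, String, or Symbol.
  def atom(token)
    case token
    when /\A-?\d+\z/      then token.to_i
    when /\A-?\d+\.\d+\z/ then token.to_f
    when /\A"/            then token[1..-2] # strip quotes; escapes left as-is
    else token.to_sym
    end
  end
end

p SexpReader.new.parse('(define (square x) (* x x))')
# prints [:define, [:square, :x], [:*, :x, :x]]
```

Even at this size, the tokenizer, the recursion, and the error handling are all tangled together by hand, whereas a Treetop grammar would express the same structure declaratively.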
If you're interested in parsing merely as a road to creating your own programming language, though, check out Create Your Own Programming Language by Marc-André Cournoyer. It's a good read and even inspired CoffeeScript!
November 3, 2010 at 7:06 pm
Treetop is great, but I've moved to Citrus. Citrus lets you use regular expressions in rules. Also, it doesn't require that the whole input match a rule.
December 27, 2010 at 11:51 pm
Treetop has an option that sets whether the whole input must match the rule or not; it just defaults to yes.