Recursive Descent Parser for Ruby
Sometimes strange things happen. I've been developing a small, basic recursive descent parser for Ruby called RDParse. Just before writing this post I decided to Google that name, and lo and behold the first result is a Ruby recursive descent parser called RDParse, created by Dennis Ranke, that I posted to Code Snippets for posterity several months ago. Since both of these libraries are unlikely to be used at once and that Dennis doesn't seem to be maintaining his version, I've decided to stick with RDParse as the name of mine for now.
You can download my RDParse as rdparse.rb.txt, just rename it at your end if you want to use it. To use it, you'd do something like this (no syntax coloring as the Syntax gem doesn't appear to cope with the complexity):
require 'rdparser' parser = RDParser.new do |g| g.main 'line(s)' g.line 'expression separator(?) comment(?)' g.comment '"#" rest_of_line' g.rest_of_line /.+$/ g.separator /;/ g.expression 'term operation expression | term' g.term 'number | variable | string | brkt_expression' g.brkt_expression '"(" expression ")"' g.number /d+(.d+)?/ g.operation /[+-*/]/ g.variable /[a-z][a-z0-9]*/ g.string %r(["'](.*?[^\]|.*?)["']) end content = %q{ (34 - 3) * 42; # Comment here.. "a" + "bcd" } syntax_tree = parser.parse(:main, content) puts RDParser.text_syntax_tree(syntax_tree)
Here, a grammar is defined within the RDParser.new block, although it can also be passed in as a parameter in a hash. This grammar (for a nonsense language that can only perform basic expressions) is used to fuel a parser that generates the following syntax tree from the code in the content variable:
line expression term brkt_expression "(" => ( expression term number => 34 operation => - expression term number => 3 ")" => ) operation => * expression term number => 42 separator => ; comment "#" => # rest_of_line => Comment here.. line expression term string => "a" operation => + expression term string => "bcd"
It's simple but effective. My initial version had callbacks and some other features that I discovered I didn't really need (and which I'd poorly implemented anyway). This version just does the basics, lexing and enough parsing to get a tree.
I'm not releasing it as a Gem or on RubyForge yet as it has a long way to go, but for anyone interested in this stuff, see what grammars you can knock up! Next step.. error reporting.
November 15, 2006 at 10:19 am
Could RDParse also benefit from the so-called packrat parsing algorithm in some way?
http://meta-meta.blogspot.com/2006/04/packrat-parsers.html
Pingback: Treetop - Powerful But Easy Ruby Parser Library