Parse XML quickly and easily with Hpricot
Following on from "Parsing XML with REXML using Expat", which covered using Expat to make REXML faster, Chris Wanstrath e-mailed me about his co-worker PJ's post, "Parse XML with Hpricot". Hpricot, covered previously in "Fast HTML parsing in Ruby with Hpricot", is a fast HTML parser for Ruby written mostly in C by Ruby legend whytheluckystiff.
PJ says that because XML shares HTML's basic tag structure, Hpricot should work fine with raw XML, and it does:
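    require "hpricot"

    # Child elements to copy from each <product> node
    FIELDS = %w[SKU ItemName CollectionNo Pages]

    doc = Hpricot.parse(File.read("my.xml"))

    # doc/:product is shorthand for doc.search("product")
    (doc/:product).each do |xml_product|
      product = Product.new  # Product is the application's own model class
      FIELDS.each do |field|
        # Take the raw text of the first matching child element
        product[field] = xml_product.search("/#{field}").first.children.first.raw_string
      end
      product.save
    end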
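For comparison, here is a rough sketch of the same loop written against plain REXML (no Expat). It is not taken from PJ's post: the sample document, the `extract_products` helper, and the use of a plain Hash in place of the `Product` model are all assumptions made so the example can run on its own.

```ruby
require "rexml/document"

# Hypothetical sample data; the element names mirror FIELDS in the snippet above.
SAMPLE_XML = <<~DOC
  <products>
    <product>
      <SKU>A-100</SKU>
      <ItemName>Widget</ItemName>
      <CollectionNo>7</CollectionNo>
      <Pages>32</Pages>
    </product>
    <product>
      <SKU>B-200</SKU>
      <ItemName>Gadget</ItemName>
      <CollectionNo>9</CollectionNo>
      <Pages>48</Pages>
    </product>
  </products>
DOC

FIELDS = %w[SKU ItemName CollectionNo Pages]

# Hypothetical helper: returns one Hash per <product> element.
# A Hash stands in for the Product model, which is not available here.
def extract_products(xml)
  doc = REXML::Document.new(xml)
  products = []
  doc.elements.each("//product") do |xml_product|
    product = {}
    FIELDS.each do |field|
      node = xml_product.elements[field]   # first child element with this name, or nil
      product[field] = node && node.text   # nil when the field is missing
    end
    products << product
  end
  products
end

extract_products(SAMPLE_XML).each { |product| puts product.inspect }
```

The shape of the loop is much the same as in the Hpricot version. The nil check matters in both: the Hpricot chain `.first.children.first` would raise if a product lacked one of the fields.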
There are fewer hoops to jump through than with the REXML/Expat route, and it's still extremely fast. Learn more.