Classifier Gem: Bayesian and LSI Classification for Ruby
Classifier is a Ruby gem developed by Lucas Carlson and David Fayram II that provides Bayesian and other types of classification, including Latent Semantic Indexing (LSI).
A Bayes classifier is a probabilistic algorithm that applies Bayes’ theorem to learn the underlying probability distribution of the data. One popular use is spam filtering, where most packages implement it.
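To make the idea concrete, here is a minimal naive Bayes text classifier in plain Ruby. It is only a sketch and is not the Classifier gem's implementation: the `TinyBayes` class, its tokenizer, and the add-one smoothing are assumptions chosen for illustration. Each category's score is the log of its prior plus the logs of the smoothed per-word likelihoods, and the highest score wins.

```ruby
# Toy naive Bayes text classifier (hypothetical; not the Classifier gem's code).
class TinyBayes
  def initialize(*categories)
    # Per-category word counts, e.g. { "Funny" => { "boat" => 1 } }
    @word_counts = categories.map { |c| [c, Hash.new(0)] }.to_h
    # Number of training documents seen per category
    @doc_counts = categories.map { |c| [c, 0] }.to_h
  end

  def train(category, text)
    @doc_counts[category] += 1
    tokenize(text).each { |word| @word_counts[category][word] += 1 }
  end

  # Returns { category => log-probability score } for the given text.
  def scores(text)
    vocab_size = @word_counts.values.flat_map(&:keys).uniq.size
    total_docs = @doc_counts.values.sum.to_f
    words = tokenize(text)

    @word_counts.map do |category, counts|
      total_words = counts.values.sum
      # Log prior, with add-one smoothing across categories
      score = Math.log((@doc_counts[category] + 1) / (total_docs + @doc_counts.size))
      # Log likelihood of each word, with add-one smoothing across the vocabulary
      words.each do |word|
        score += Math.log((counts[word] + 1.0) / (total_words + vocab_size))
      end
      [category, score]
    end.to_h
  end

  def classify(text)
    scores(text).max_by { |_category, score| score }.first
  end

  private

  # Assumption: lowercase the text and split on anything but letters and apostrophes.
  def tokenize(text)
    text.downcase.scan(/[a-z']+/)
  end
end

bayes = TinyBayes.new("Funny", "Not Funny")
bayes.train("Funny", "damn you ice cream come to my mouth")
bayes.train("Funny", "a boat's a boat but a box could be anything")
bayes.train("Not Funny", "the meeting is scheduled for monday morning")
bayes.train("Not Funny", "please submit the quarterly report")

puts bayes.classify("how dare you ice cream") # prints "Funny"
```

Working in log space avoids the numeric underflow you would get from multiplying many small probabilities, and the smoothing keeps a single unseen word from zeroing out a category.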
Bayes classification can also be applied to many other machine learning problems to make your Ruby application more intelligent (thankfully, the gem handles the complicated implementation for you). Ilya Grigorik recently posted an interesting tutorial on Bayes classification, with an easy-to-follow demonstration of how to use it to distinguish funny quotes from not-funny ones:
require 'rubygems'
require 'yaml'       # needed for YAML::load_file below
require 'stemmer'
require 'classifier'

# Load previous classifications
funny = YAML::load_file('funny.yml')
not_funny = YAML::load_file('not_funny.yml')

# Create our Bayes / LSI classifier
classifier = Classifier::Bayes.new('Funny', 'Not Funny')

# Train the classifier
not_funny.each { |boo| classifier.train_not_funny boo }
funny.each { |good_one| classifier.train_funny good_one }

# Let's classify some new quotes
puts classifier.classify "Peter: A boat's a boat but a box could be anything! It could even be a boat!"
puts classifier.classify "Stewie: Damn you ice cream, come to my mouth! How dare you disobey me!"
puts classifier.classify "Brian: I could take my sweater off too, but I think it's attached to my skin."
puts classifier.classify "Peter: Hey, anybody got a quarter? Bill Gates: What's a quarter?"
puts classifier.classify "Peter: I had such a crush on her. Until I met you Lois. You're my silver medal."
puts classifier.classify "Meg: Excuse me, Mayor West? Adam West: How do you know my language?"
puts classifier.classify "Meg: You could kill all the girls who are prettier than me. Death: Well, that would just leave England."
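The example above reads its training data from funny.yml and not_funny.yml, which are not shown. Since each file is iterated with `each` and every element is passed to a `train_*` method, a plausible layout is a plain YAML list of strings. The sketch below assumes that layout and uses made-up sample quotes; it writes such a file to a temporary directory and reads it back using only Ruby's standard library.

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical sample data; the contents of the real funny.yml are not known.
funny_quotes = [
  "Stewie: Damn you ice cream, come to my mouth!",
  "Peter: Hey, anybody got a quarter? Bill Gates: What's a quarter?"
]

# Write the list as YAML to a temporary file, then load it back.
loaded = Dir.mktmpdir do |dir|
  path = File.join(dir, 'funny.yml')
  File.write(path, funny_quotes.to_yaml)
  YAML.load_file(path)
end

# Each element is a plain String, ready to be passed to a training method.
loaded.each { |quote| puts quote }
```

Keeping the training data in a simple list like this makes it easy to append newly classified quotes and retrain from scratch.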
Alternatives and other useful resources: bn4r (article), Bishop, Microsoft Belief Network
May 24, 2007 at 5:12 pm
bn4r (Bayesian Networks for Ruby) recently joined forces with sbn (Simple Bayesian Networks), as they are very similar projects. The result will be hosted at http://rubyforge.org/projects/sbn4r/ .
May 25, 2007 at 2:09 pm
Dear Helder Ribeiro,
Thank you so much for your update.
And good luck for your Firewatir-Gen Google Summer of Code challenge! :-) Keep rocking!
May 28, 2007 at 10:00 am
There is also the clusterer gem (http://rubyforge.org/projects/clusterer/), which has various types of Bayesian classifiers, clustering algorithms, LSI, and many different stemming alternatives, though I admit it's currently not very well documented. See http://cuttingtheredtape.blogspot.com/2007/03/clusterer-other-plugins.html.
May 29, 2007 at 2:56 am
Dear Surendra Singhi,
Thank you for your information. It surely is very useful!