orngBayes: A Helper Module for Naive Bayesian Classifier

Module orngBayes contains a Bayesian learner that wraps the one built into Orange. The only difference is that using the m-estimate is much simpler than with the built-in learner. The module also has a function for printing out the classifier in textual form.

Bayes Learner

Class orngBayes.BayesLearner is similar to orange.BayesLearner (the former in fact uses the latter), except that to estimate probabilities with m-estimates you don't need to set up several probability estimators; you simply specify the value of m. To show how, let's compare a naive Bayesian classifier that uses the m-estimate (m=2) with one that uses relative frequencies to estimate probabilities.

part of bayes.py

    import orange, orngBayes, orngTest, orngStat

    data = orange.ExampleTable("lung-cancer")

    bayes = orngBayes.BayesLearner()
    bayes_m = orngBayes.BayesLearner(m=2)

    res = orngTest.crossValidation([bayes, bayes_m], data)
    CAs = orngStat.CA(res)
    print
    print "Without m: %5.3f" % CAs[0]
    print "With m=2: %5.3f" % CAs[1]
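
For reference, the two estimates being compared can be written out by hand. The sketch below is not orngBayes code. It assumes the usual definition of the m-estimate, p = (n_c + m * p_a) / (n + m), where n_c is the number of matching examples, n the number of all examples and p_a a prior probability. The function names and the counts are made up for illustration, and the code is written so that it runs without Orange.

```python
def relative_frequency(n_c, n):
    """Plain relative frequency: matching examples over all examples."""
    return float(n_c) / n


def m_estimate(n_c, n, prior, m=2.0):
    """m-estimate: as if m extra examples, distributed by `prior`, were added."""
    return (n_c + m * prior) / float(n + m)


# Hypothetical counts: 3 of 4 examples match, and the prior is 0.5.
print("relative frequency: %5.3f" % relative_frequency(3, 4))
print("m-estimate (m=2):   %5.3f" % m_estimate(3, 4, 0.5, m=2))

# With no matching examples the relative frequency is zero,
# while the m-estimate is pulled towards the prior and stays positive.
print("relative frequency: %5.3f" % relative_frequency(0, 4))
print("m-estimate (m=2):   %5.3f" % m_estimate(0, 4, 0.5, m=2))
```

The second pair of calls shows why the m-estimate matters on small data sets such as the one above: a single zero count no longer forces the whole product of conditional probabilities to zero.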

Attributes

All the attributes listed here are optional in the sense that they appear and are used only if you set them.

m
m for m-estimate. If you set it, m-estimation of probabilities will be used (through class orange.ProbabilityEstimatorConstructor_m). This attribute is ignored if you also set estimatorConstructor.
estimatorConstructor
Probability estimator constructor to be used for a priori class probabilities. If set, it should be an instance of a class derived from orange.ProbabilityEstimatorConstructor. Setting this attribute disables the attribute m described above.
conditionalEstimatorConstructor
Probability estimator constructor for conditional probabilities for discrete attributes. If the attribute is omitted, the same estimator will be used as for a priori class probabilities.
conditionalEstimatorConstructorContinuous
As above, but for continuous attributes.

If none of these attributes is given, relative frequencies are used for a priori class probabilities and conditional probabilities of discrete attributes, and loess is used for continuous attributes.
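
The precedence rules above can be summarized in a few lines of plain Python. This is only a sketch of the rules as described in the text, not the module's source: the function resolve_estimators is hypothetical, and estimators are represented by descriptive strings instead of Orange objects. For continuous attributes the sketch assumes that loess is used whenever conditionalEstimatorConstructorContinuous is not set, which the text does not spell out.

```python
def resolve_estimators(**attrs):
    """Return a label for the estimator used for each kind of probability.

    Hypothetical helper: `attrs` stands for the attributes set on the learner.
    """
    if "estimatorConstructor" in attrs:
        # An explicitly set constructor wins; m is ignored in this case.
        prior = attrs["estimatorConstructor"]
    elif "m" in attrs:
        prior = "m-estimate (m=%s)" % attrs["m"]
    else:
        prior = "relative frequencies"

    # Discrete conditionals fall back to the estimator used for the priors.
    discrete = attrs.get("conditionalEstimatorConstructor", prior)
    # Assumption: continuous attributes default to loess when nothing is set.
    continuous = attrs.get("conditionalEstimatorConstructorContinuous", "loess")
    return {"prior": prior, "discrete": discrete, "continuous": continuous}


for settings in ({}, {"m": 2}, {"m": 2, "estimatorConstructor": "Laplace"}):
    chosen = resolve_estimators(**settings)
    print("%-45r prior: %s, discrete: %s, continuous: %s" % (
        settings, chosen["prior"], chosen["discrete"], chosen["continuous"]))
```

The string "Laplace" is just a placeholder for some estimator constructor; in the third call it overrides m, as described for the attribute estimatorConstructor.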

Methods

Constructor
As usual, the constructor can be called with no arguments or with keyword arguments (such as m=2), which are copied to the object's attributes; it then returns an instance of orngBayes.BayesLearner. Alternatively, it can be given learning examples (and possibly a weight meta attribute), in which case it constructs and returns a classifier (an instance of orange.BayesClassifier).
__call__(self, examples, weightID = 0)
Call the learner with examples as an argument (and, optionally, the id of the meta attribute with weights) and you'll get the classifier, as usual. The classifier will be an instance of orange.BayesClassifier; module orngBayes doesn't provide a special classifier.
createInstance(self)
This function returns an instance of orange.BayesLearner with the components set as defined by the attributes. Actually, when you call orngBayes.BayesLearner it calls createInstance to construct an appropriate orange.BayesLearner and forwards the call to it.
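
The two ways of using the constructor can be illustrated with a toy learner in plain Python. This is a sketch of the calling convention only and makes no claim about how orngBayes implements it: ToyLearner and ToyClassifier are hypothetical classes, the "examples" are simply a list of class labels, and the learning is done inline where the real module forwards the call to orange.BayesLearner.

```python
class ToyClassifier(object):
    """Hypothetical stand-in for a classifier: predicts the most probable class."""

    def __init__(self, class_probs):
        self.class_probs = class_probs

    def __call__(self, example=None):
        return max(self.class_probs, key=self.class_probs.get)


class ToyLearner(object):
    """Hypothetical learner that follows the constructor convention described above."""

    def __new__(cls, examples=None, weightID=0, **kwds):
        learner = object.__new__(cls)
        if examples is None:
            return learner              # Python runs __init__ next and stores kwds
        learner.__init__(**kwds)        # given data: configure, then learn at once
        return learner(examples, weightID)

    def __init__(self, examples=None, weightID=0, **kwds):
        self.__dict__.update(kwds)      # keyword arguments become attributes

    def __call__(self, examples, weightID=0):
        # weightID is accepted for symmetry but ignored in this toy version.
        m = getattr(self, "m", 0)
        classes = sorted(set(examples))
        prior = 1.0 / len(classes)      # uniform prior, an assumption of the sketch
        n = len(examples)
        probs = dict((c, (examples.count(c) + m * prior) / float(n + m))
                     for c in classes)
        return ToyClassifier(probs)


learner = ToyLearner(m=2)                       # no examples: we get a learner
classifier = ToyLearner(["y", "n", "y"], m=2)   # examples given: we get a classifier
print("%s, %s" % (type(learner).__name__, type(classifier).__name__))
print("predicted class: %s" % classifier())
```

Returning something other than an instance of the class from __new__ is what lets a single name serve as both a learner factory and a shortcut for training.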

Printing out the model

To print out the model in the form of contingency matrices, call the function orngBayes.printModel(model), as in the code below:

part of bayes.py

    import orange, orngBayes

    data = orange.ExampleTable("voting")
    model = orngBayes.BayesLearner(data)
    orngBayes.printModel(model)

The output will start with

                           republican    democrat
    class probabilities         0.386       0.614

    Attribute handicapped-infants
                           republican    democrat
    n                           0.568       0.432
    y                           0.166       0.834

    Attribute water-project-cost-sharing
                           republican    democrat
    y                           0.385       0.615
    n                           0.380       0.620

... and so on through all the attributes.

This function cannot print out the model if it contains continuous attributes or advanced probability estimators that don't store pre-computed probabilities (we have none like this at the moment).
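
In the output shown above, each row of an attribute table sums to 1, so the entries read as class probabilities given the attribute value. A table of that shape can be imitated in plain Python. The sketch below is not the code of printModel: the function format_table is hypothetical, the counts are made up, the column widths are arbitrary, and plain relative frequencies are used for simplicity.

```python
def format_table(attribute, classes, counts):
    """Format P(class | value) for each attribute value, one row per value.

    `counts` is a list of (value, [count per class]) pairs; hypothetical input.
    """
    lines = ["Attribute " + attribute]
    lines.append(" " * 6 + "".join("%12s" % c for c in classes))
    for value, row in counts:
        total = float(sum(row))
        cells = "".join("%12.3f" % (n / total) for n in row)
        lines.append("%-6s" % value + cells)
    return "\n".join(lines)


# Made-up counts per value, in the order (republican, democrat).
print(format_table("example-attribute", ["republican", "democrat"],
                   [("n", [3, 1]), ("y", [1, 4])]))
```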