orngLookup: Functions for Working with Classifiers That Use a Stored Example Table

This module contains several functions for working with classifiers that use a stored example table for making predictions. There are four such classifiers; the most general stores an ExampleTable and the other three are specialized and optimized for cases where the domain contains only one, two or three attributes (besides the class attribute).


Functions

lookupFromBound(classVar, bound)

This function constructs an appropriate lookup classifier for one, two or three attributes. If there are more, it returns None. The resulting classifier is of type ClassifierByLookupTable, ClassifierByLookupTable2 or ClassifierByLookupTable3, with classVar and the bound set as given.

If, for instance, data contains a dataset Monk 1 and you would like to construct a new feature from attributes a and b, you can call this function as follows.

>>> import orange, orngLookup
>>> newvar = orange.EnumVariable()
>>> bound = [data.domain[name] for name in ["a", "b"]]
>>> lookup = orngLookup.lookupFromBound(newvar, bound)
>>> print lookup.lookupTable
<?, ?, ?, ?, ?, ?, ?, ?, ?>

Function orngLookup.lookupFromBound initializes neither newvar nor the lookup table...

lookupFromFunction(classVar, bound, function)

... and that's exactly where lookupFromFunction differs from lookupFromBound. lookupFromFunction first calls lookupFromBound and then uses the function to initialize the lookup table. The other difference between this and the previous function is that lookupFromFunction also accepts bound sets with more than three attributes. In this case, it constructs a ClassifierByExampleTable.

The function gets the values of attributes as integer indices and should return an integer index of the "class value". The class value must be properly initialized.
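The way such a function fills a table can be sketched in plain Python, without Orange. This is only an illustration: the helper fill_lookup_table is hypothetical, and the ordering of the table (first attribute varying slowest) is an assumption, not a statement about how ClassifierByLookupTable2 stores its entries.

```python
from itertools import product

def fill_lookup_table(value_counts, function):
    """Hypothetical stand-in: return one class-value index per combination
    of attribute value indices.

    value_counts -- number of values of each bound attribute
    function     -- receives a tuple of integer indices and returns the
                    integer index of the class value
    """
    table = []
    # Assumed ordering: the first attribute varies slowest.
    for indices in product(*(range(n) for n in value_counts)):
        # int() turns a boolean result such as x[0] == x[1] into 0 or 1.
        table.append(int(function(indices)))
    return table

# Two attributes with three values each; the class has values "no" and "yes".
class_values = ["no", "yes"]
table = fill_lookup_table([3, 3], lambda x: x[0] == x[1])
labels = [class_values[i] for i in table]
```

A function that returns a boolean works here because False and True convert to the indices 0 and 1, so the order of the class values ("no" first, "yes" second) matters.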

As an exercise, let us construct a new attribute called a=b whose value will be "yes" when a and b are equal and "no" when they are not. We will then add the attribute to the dataset.

>>> bound = [data.domain[name] for name in ["a", "b"]]
>>> newVar = orange.EnumVariable("a=b", values = ["no", "yes"])
>>> lookup = orngLookup.lookupFromFunction(newVar, bound, lambda x: x[0]==x[1])
>>> newVar.getValueFrom = lookup
>>> import orngCI
>>> data2 = orngCI.addAnAttribute(newVar, data)
>>> for i in data2[:30]:
...     print i
['1', '1', '1', '1', '1', '1', 'yes', '1']
['1', '1', '1', '1', '1', '2', 'yes', '1']
['1', '1', '1', '1', '2', '1', 'yes', '1']
['1', '1', '1', '1', '2', '2', 'yes', '1']
...
['2', '1', '2', '3', '4', '1', 'no', '0']
['2', '1', '2', '3', '4', '2', 'no', '0']
['2', '2', '1', '1', '1', '1', 'yes', '1']
['2', '2', '1', '1', '1', '2', 'yes', '1']
...

The attribute was inserted using orngCI.addAnAttribute. By setting newVar.getValueFrom to lookup we state that whenever domains are converted (either when needed by addAnAttribute or elsewhere), lookup should be used to compute newVar's value. (A bit off topic, but important: you should never call getValueFrom directly, but always call it through computeValue.)

lookupFromExamples(examples [, weight])

This function takes a set of examples (an ExampleTable, for instance) and turns it into a classifier. If there are one, two or three attributes and no ambiguous examples (examples are ambiguous if they have the same attribute values but different class values), it will construct an appropriate ClassifierByLookupTable. Otherwise, it will return a ClassifierByExampleTable.

>>> lookup = orngLookup.lookupFromExamples(data)
>>> testExample = orange.Example(data.domain, ['3', '2', '2', '3', '4', '1', '?'])
>>> lookup(testExample)
<orange.Value 'y'='0'>
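The selection rule described for lookupFromExamples can be sketched in plain Python, without Orange. The helpers find_ambiguous and choose_classifier_kind are hypothetical and only model the decision as stated in the text; they say nothing about how the module actually implements it.

```python
def find_ambiguous(examples):
    """Return the attribute-value tuples that occur with more than one class.

    examples -- iterable of (attribute_values, class_value) pairs
    """
    seen = {}          # attribute values -> first class value seen
    ambiguous = set()
    for attrs, cls in examples:
        key = tuple(attrs)
        if key in seen and seen[key] != cls:
            ambiguous.add(key)
        seen.setdefault(key, cls)
    return ambiguous

def choose_classifier_kind(n_attributes, examples):
    """Name the classifier type that the rule in the text would select."""
    if 1 <= n_attributes <= 3 and not find_ambiguous(examples):
        suffix = "" if n_attributes == 1 else str(n_attributes)
        return "ClassifierByLookupTable" + suffix
    return "ClassifierByExampleTable"

# Made-up toy examples with two attributes.
clean = [(("1", "1"), "yes"), (("1", "2"), "no")]
noisy = clean + [(("1", "1"), "no")]   # same attributes, different class

kind_clean = choose_classifier_kind(2, clean)
kind_noisy = choose_classifier_kind(2, noisy)
kind_wide = choose_classifier_kind(6, clean)
```

Under this rule a dataset with six attributes, as in the example above, always ends up as a ClassifierByExampleTable, regardless of whether it contains ambiguous examples.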

printLookupFunction(func)

printLookupFunction returns a string with a lookup function in tab-delimited format. Argument func can be any of the above-mentioned classifiers or an attribute whose getValueFrom points to one of these classifiers.

Module orngLookup sets the output for those classifiers using the orange output schema. This means that you don't need to call printLookupFunction directly. Use dump and write functions instead.

For instance, if lookup is the classifier constructed in the example for lookupFromFunction, you can print it out by

>>> print lookup.dump("tab")
a       b       a=b
------  ------  ------
1       1       yes
1       2       no
1       3       no
2       1       no
2       2       yes
2       3       no
3       1       no
3       2       no
3       3       yes

Function write writes it to a file, either a new one

>>> lookup.write("tab", "d:\\t.txt")

or to an already open file (this way you can write more things to one file)

>>> lookup.write("tab", f)
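The idea of writing several things to one open file can be illustrated with a plain Python sketch that does not use Orange. The functions format_lookup and write_lookup are hypothetical stand-ins, and the layout (header, a row of dashes, one row per combination) is only modeled on the example output above; it is not the module's own implementation. An in-memory io.StringIO object plays the role of the already open file f.

```python
import io
from itertools import product

def format_lookup(attr_names, attr_values, class_name, class_values, table):
    """Render a lookup table as tab-delimited text (hypothetical helper)."""
    columns = len(attr_names) + 1
    lines = ["\t".join(attr_names + [class_name])]
    lines.append("\t".join(["------"] * columns))
    # Assumed ordering: the first attribute varies slowest.
    for combo, class_index in zip(product(*attr_values), table):
        lines.append("\t".join(list(combo) + [class_values[class_index]]))
    return "\n".join(lines) + "\n"

def write_lookup(out, *args):
    """Write the rendered table to an already open file-like object."""
    out.write(format_lookup(*args))

out = io.StringIO()                # stands in for an already open file
out.write("# first table\n")       # something else written to the same file
write_lookup(out, ["a", "b"], [["1", "2", "3"]] * 2,
             "a=b", ["no", "yes"], [1, 0, 0, 0, 1, 0, 0, 0, 1])
text = out.getvalue()
```

Because the caller owns the file object, any number of tables or other lines can be appended before it is closed.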