Classifiers from attribute predict the class based on the value of a single attribute. Although they can be used for making predictions, they play a different, yet important, role in Orange: they are used not to predict class values but to compute attribute values. For instance, when a continuous attribute is discretized and replaced by a discrete attribute, an instance of a classifier from attribute takes care of computing the new value automatically when needed. Similarly, a classifier from attribute usually decides the branch when an example is classified by a decision tree.
There are two classifiers from attribute: the simpler ClassifierFromVarFD assumes that the example comes from some fixed domain, while the safer ClassifierFromVar does not. You should primarily use the latter, especially since it uses a caching scheme that makes it practically as fast as the former.
Both classifiers can be given a transformer that modifies the value. In discretization, for instance, the transformer is responsible for computing the discrete interval for a continuous value of the original attribute.
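Attributes
whichVar: the descriptor of the attribute whose value is used.
transformer: the transformer for the value; normally a TransformValue, but you can also use a callback function.
distributionForUnknown: the distribution that is returned when whichVar's value is undefined.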
When given an example, ClassifierFromVar returns transformer(example[whichVar]). whichVar can be an ordinary attribute, a meta attribute, or an attribute that is not defined for the example but has a getValueFrom that can be used to compute the value. If none of these succeeds, or if the value found is unknown, a Value of subtype Distribution containing distributionForUnknown is returned.
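The lookup order described above can be sketched in plain Python. This is only a toy model, not Orange's implementation: ToyAttribute, ToyClassifierFromVar, the dictionary-based example and the use of None for an unknown value are all hypothetical stand-ins, and returning the value unchanged when no transformer is set is an assumption.

```python
UNKNOWN = None  # hypothetical stand-in for an unknown value


class ToyAttribute:
    """Hypothetical attribute descriptor; not an Orange class."""

    def __init__(self, name, get_value_from=None):
        self.name = name
        self.get_value_from = get_value_from  # plays the role of getValueFrom


class ToyClassifierFromVar:
    """Toy model of the lookup order: attribute, meta attribute, getValueFrom."""

    def __init__(self, which_var, transformer=None, distribution_for_unknown=None):
        self.which_var = which_var
        self.transformer = transformer
        self.distribution_for_unknown = distribution_for_unknown

    def __call__(self, example):
        # An example is modelled as {"attributes": {...}, "metas": {...}}.
        name = self.which_var.name
        if name in example["attributes"]:
            value = example["attributes"][name]
        elif name in example["metas"]:
            value = example["metas"][name]
        elif self.which_var.get_value_from is not None:
            value = self.which_var.get_value_from(example)
        else:
            value = UNKNOWN
        if value is UNKNOWN:
            return self.distribution_for_unknown
        # Assumption: without a transformer the value is passed through as is.
        return self.transformer(value) if self.transformer else value


e = ToyAttribute("e")
classifier = ToyClassifierFromVar(
    e,
    transformer=lambda v: v * 2,
    distribution_for_unknown={"1": 0.5, "not 1": 0.5},
)
```

Calling classifier with an example that has e among its attributes or metas yields the transformed value; an example without e falls through to the distribution for unknown values.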
The class stores the domain version of the last example and the position of whichVar in that domain. If consecutive examples come from the same domain (which is usually the case), ClassifierFromVar is just two simple ifs slower than ClassifierFromVarFD.
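The caching idea can be illustrated with a small sketch. It is a hedged model only: ToyDomain, CachedLookup and the searches counter are hypothetical names, and the sketch does not claim to reproduce the exact checks that Orange performs.

```python
class ToyDomain:
    """Hypothetical domain: ordered attribute names plus a unique version."""

    _next_version = 0

    def __init__(self, names):
        self.names = list(names)
        ToyDomain._next_version += 1
        self.version = ToyDomain._next_version


class CachedLookup:
    """Remembers the attribute's position for the most recently seen domain."""

    def __init__(self, name):
        self.name = name
        self.last_version = None
        self.last_position = None
        self.searches = 0  # counts slow-path searches, for demonstration only

    def __call__(self, domain, values):
        if domain.version != self.last_version:
            # Cache miss: search the domain once and remember the result.
            self.searches += 1
            self.last_position = domain.names.index(self.name)
            self.last_version = domain.version
        # Cache hit (the usual case): a direct index into the values.
        return values[self.last_position]


lookup = CachedLookup("e")
first = ToyDomain(["a", "b", "e"])
second = ToyDomain(["e", "a"])
```

As long as consecutive examples share a domain, the search is done only once; it is repeated only when an example from a different domain arrives.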
As you might have guessed, the crucial component here is the transformer. Let us, for the sake of demonstration, load the Monk 1 dataset and construct an attribute e1 that has the value "1" when e is "1", and "not 1" when e is anything else. There are many ways to do this, and the same problem is covered in several places in the Orange documentation. Although the way presented here is not the simplest, it serves to demonstrate how ClassifierFromVar works.
part of classifierFromVar.py (uses monk1.tab)
First, you might have noticed that transformer, an attribute of the pure C++ object ClassifierFromVar, has been assigned a Python function. As the documentation on callback functions explains, the function is automatically wrapped into a C++ class that converts the arguments to Python and the result back. (Not that you need to know about it. Just use it and be happy that it works.)
The problem here is that eTransformer doesn't get the nice instances of orange.Value that you are used to. You cannot compare the value to a string: the function cannot begin with if value == "1", since the value has no associated attribute descriptor that would "understand" the string "1". Instead, you need to use integer indices. Since the values of e are "1", "2", "3" and "4", index 0 corresponds to the value "1". The same goes for returned values: the values of e1 are "1" and "not 1", in this order, so returning 0 means "1" and returning 1 means "not 1".
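The index-based mapping can be sketched without Orange as follows. This is an illustration under stated assumptions, not the listing from classifierFromVar.py: E_VALUES and E1_VALUES are hypothetical lists standing in for the value lists of e and e1, and the function receives a plain integer index rather than an Orange value.

```python
E_VALUES = ["1", "2", "3", "4"]  # values of e, in domain order
E1_VALUES = ["1", "not 1"]       # values of e1, in domain order


def eTransformer(value):
    """Map an index into E_VALUES to an index into E1_VALUES."""
    # Index 0 is the value "1"; every other index maps to "not 1".
    return 0 if int(value) == 0 else 1


for index, name in enumerate(E_VALUES):
    print(name, "->", E1_VALUES[eTransformer(index)])
```

The function never touches the strings themselves; the value lists are consulted only to display the result.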
Having written the transformer, the rest is trivial: we assign a ClassifierFromVar to the new attribute's getValueFrom, and set its whichVar to e and its transformer to eTransformer.
To check the results, we construct a new example table containing only the attributes a, b and e, the new attribute e1 and the class attribute. When the examples are converted to the new domain, the value of e1 is computed by calling ClassifierFromVar, so that for each example ex, e1 has the value eTransformer(ex[e]).
ClassifierFromVarFD is very similar to ClassifierFromVar, except that the attribute is not given as a descriptor (like whichVar) but as an index. The index can be either the position of the attribute in the domain or a meta-id. Given that ClassifierFromVarFD is practically no faster than ClassifierFromVar (and may in the future even be merged with it), you should seldom need to use this class.
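Attributes
domain (inherited from ClassifierFD): the domain from which the examples are expected to come.
position: the position of the attribute in the domain, or its meta-id.
transformer: the transformer for the value.
distributionForUnknown: the distribution that is returned when the attribute's value is undefined.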
When an example is passed to ClassifierFromVarFD, the classifier first checks whether the example is from the correct domain; an exception is raised if it is not. If the domain is OK, the corresponding attribute value is retrieved, transformed and returned.
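A minimal plain-Python sketch of this behaviour follows. It is a toy model rather than Orange code: ToyClassifierFromVarFD, the tuple used as a domain, the identity check on the domain and the ValueError are all assumptions made for illustration.

```python
class ToyClassifierFromVarFD:
    """Toy fixed-domain classifier; not Orange's implementation."""

    def __init__(self, domain, position, transformer):
        self.domain = domain            # the only domain this classifier accepts
        self.position = position        # index of the attribute in that domain
        self.transformer = transformer  # callable applied to the retrieved value

    def __call__(self, example_domain, values):
        # Refuse examples from any other domain instead of guessing.
        if example_domain is not self.domain:
            raise ValueError("example is not from the expected domain")
        return self.transformer(values[self.position])


# Stand-in for the attributes of the Monk 1 dataset (class omitted).
MONK_DOMAIN = ("a", "b", "c", "d", "e", "f")

classifier = ToyClassifierFromVarFD(
    domain=MONK_DOMAIN,
    position=MONK_DOMAIN.index("e"),
    transformer=lambda index: 0 if index == 0 else 1,
)
```

Because only an index is stored, the toy classifier has nothing to fall back on when the domain differs, which is why it raises an exception rather than searching for the attribute.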
ClassifierFromVarFD's twin brother, ClassifierFromVar, can also handle attributes that are neither in the example's domain nor among its meta-attributes, but can be computed from them through their getValueFrom. Since ClassifierFromVarFD stores only an index and not an attribute descriptor, it cannot offer this functionality.
To rewrite the above script to use ClassifierFromVarFD, we need to set domain and to store the index of e in position (the equivalent of setting whichVar in ClassifierFromVar). The initialization of ClassifierFromVarFD thus goes like this:
part of classifierFromVar.py (uses monk1.tab)