Cost Matrix

CostMatrix is an object that stores costs of (mis)classifications. Costs can be either negative or positive.

Attributes

classVar
The (class) attribute to which the matrix applies. This can also be None.
dimension (read only)
Matrix dimension, i.e. the number of classes.

Methods

CostMatrix(dimension[, default cost])
Constructs a square matrix of the given dimension. All off-diagonal elements are set to the default cost (1, if not given); the diagonal elements are set to 0. (Diagonal elements represent correct classifications, which usually carry no cost; you can, however, change this.)

part of CostMatrix.py

import orange

cm = orange.CostMatrix(3)
print "classVar =", cm.classVar
for pred in range(3):
    for corr in range(3):
        print cm.getcost(pred, corr),
    print

This initializes the matrix and prints it out:

0.0 1.0 1.0
1.0 0.0 1.0
1.0 1.0 0.0
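The layout described above can be reproduced without Orange. The following is a minimal sketch in plain Python; the helper name default_cost_matrix is hypothetical and is not part of Orange, and the nested list only imitates the values that CostMatrix would hold.

```python
def default_cost_matrix(dimension, default_cost=1.0):
    """Hypothetical helper (not Orange API): build a square cost table.

    Rows are indexed by the predicted class, columns by the correct class.
    Off-diagonal cells get default_cost, diagonal cells get 0.
    """
    return [
        [0.0 if pred == corr else float(default_cost) for corr in range(dimension)]
        for pred in range(dimension)
    ]


matrix = default_cost_matrix(3)
for row in matrix:
    # One line per predicted class, as in the output shown above.
    print(" ".join(str(cost) for cost in row))
```

Passing 2 as the second argument gives the same layout with 2s in place of 1s.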

CostMatrix(class descriptor[, default cost])
Similar to the above, except that classVar is also set to the given descriptor. The dimension equals the number of values of the given attribute, which must be discrete.

part of CostMatrix.py (uses iris.tab)

data = orange.ExampleTable("iris")
cm = orange.CostMatrix(data.domain.classVar, 2)

This constructs a matrix similar to the one above (the class attribute in iris domain is three-valued) except that the matrix contains 2s instead of 1s.

CostMatrix([attribute descriptor, ]matrix)
Initializes the matrix with elements given as a sequence of sequences (you can mix lists and tuples if you like). Each subsequence represents a row.

part of CostMatrix.py (uses iris.tab)

cm = orange.CostMatrix(data.domain.classVar, [(0, 2, 1), (2, 0, 1), (2, 2, 0)])

If you print this matrix out, it will look like this:

0.0 2.0 1.0
2.0 0.0 1.0
2.0 2.0 0.0

setcost(predicted value, correct value, cost)
Set the misclassification cost.

The matrix above could also be constructed by first initializing it with 2s and then changing to 1 the costs of the two misclassifications whose correct class is Iris-virginica.

part of CostMatrix.py (uses iris.tab)

cm = orange.CostMatrix(data.domain.classVar, 2)
cm.setcost("Iris-setosa", "Iris-virginica", 1)
cm.setcost("Iris-versicolor", "Iris-virginica", 1)

getcost(predicted value, correct value)
Returns the cost of prediction. Values must be integer indices; if classVar is set, you can also use symbolic values (strings).

Note that there's no way to change the size of the matrix. Size is set at construction and does not change.
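To illustrate how integer indices and symbolic values can address the same cell, here is a small stand-in written in plain Python. The class SimpleCostTable and its attribute class_values are hypothetical names invented for this sketch; it only mimics the behaviour described above and says nothing about how Orange implements CostMatrix internally.

```python
class SimpleCostTable(object):
    """Hypothetical stand-in for illustration; not Orange's CostMatrix."""

    def __init__(self, class_values, default_cost=1.0):
        # The size is fixed here and never changes afterwards.
        self.class_values = list(class_values)
        n = len(self.class_values)
        self._costs = [
            [0.0 if pred == corr else float(default_cost) for corr in range(n)]
            for pred in range(n)
        ]

    @property
    def dimension(self):
        return len(self._costs)

    def _index(self, value):
        # Accept an integer index, or a symbolic class name (string).
        if isinstance(value, int):
            index = value
        else:
            index = self.class_values.index(value)  # ValueError if unknown
        if not 0 <= index < self.dimension:
            raise IndexError("class index out of range: %r" % (value,))
        return index

    def setcost(self, predicted, correct, cost):
        self._costs[self._index(predicted)][self._index(correct)] = float(cost)

    def getcost(self, predicted, correct):
        return self._costs[self._index(predicted)][self._index(correct)]


table = SimpleCostTable(["Iris-setosa", "Iris-versicolor", "Iris-virginica"], 2)
table.setcost("Iris-setosa", "Iris-virginica", 1)
table.setcost("Iris-versicolor", "Iris-virginica", 1)
for pred in range(table.dimension):
    print(" ".join(str(table.getcost(pred, corr)) for corr in range(table.dimension)))
```

The sketch stores rows by predicted class and columns by correct class, matching the argument order of setcost and getcost.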

For the final example, we shall compute the profit of knowing attribute values in the lenses dataset, using the same cost matrix as printed above.

part of CostMatrix.py (uses lenses.tab)

data = orange.ExampleTable("lenses")
meas = orange.MeasureAttribute_cost()
meas.cost = ((0, 2, 1), (2, 0, 1), (2, 2, 0))
for attr in data.domain.attributes:
    print "%s: %5.3f" % (attr.name, meas(attr, data))

As the script shows, you don't have to (and usually won't) call the constructor explicitly. Instead, you set the corresponding field (in our case meas.cost) to a matrix and let Orange convert it to a CostMatrix automatically.
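One way to picture this kind of conversion on assignment in plain Python is a property whose setter passes every assigned value through a converter. The names to_cost_rows and Measure below are hypothetical and exist only for this sketch; Orange performs its conversion internally, and this is not its implementation.

```python
def to_cost_rows(value):
    """Hypothetical converter: sequence of sequences -> square list of float rows."""
    rows = [[float(cost) for cost in row] for row in value]
    if any(len(row) != len(rows) for row in rows):
        raise ValueError("cost matrix must be square")
    return rows


class Measure(object):
    """Hypothetical stand-in for an object that has a 'cost' field."""

    def __init__(self):
        self._cost = None

    @property
    def cost(self):
        return self._cost

    @cost.setter
    def cost(self, value):
        # Every assignment is converted, so callers can pass plain tuples or lists.
        self._cost = to_cost_rows(value)


meas = Measure()
meas.cost = ((0, 2, 1), [2, 0, 1], (2, 2, 0))  # tuples and lists mixed
print(meas.cost)
```

The caller never names the converter; assigning to the field is enough, which is the convenience the text describes.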

Odd as it may look, since Orange uses the constructor to perform this conversion, even the following statement is valid (although the resulting cost matrix is rather dull, with 0s on the diagonal and 1s elsewhere):

meas.cost = data.domain.classVar
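To make the meaning of the individual costs more concrete, the sketch below computes the expected cost of each possible prediction for a given class distribution, using the matrix from the lenses example. This is a general illustration of how a cost matrix prices predictions, not a description of what MeasureAttribute_cost computes; the function names and the class distribution are made up for the example.

```python
def expected_costs(cost_rows, class_probabilities):
    """Expected cost of each prediction: rows are predictions, columns correct classes."""
    return [
        sum(prob * cost for prob, cost in zip(class_probabilities, row))
        for row in cost_rows
    ]


def cheapest_prediction(cost_rows, class_probabilities):
    """Return (index, expected cost) of the prediction with the lowest expected cost."""
    costs = expected_costs(cost_rows, class_probabilities)
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


cost_rows = [(0, 2, 1), (2, 0, 1), (2, 2, 0)]
distribution = [0.5, 0.25, 0.25]  # made-up class probabilities, for illustration only

print(expected_costs(cost_rows, distribution))
print(cheapest_prediction(cost_rows, distribution))
```

With this distribution, predicting the first class is cheapest: the other two classes are wrong half of the time in total, but mistaking the third class for the first costs only 1.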