Cost Matrix
CostMatrix is an object that stores the costs of (mis)classifications. Costs can be either negative or positive.
Attributes
- classVar
- The (class) attribute to which the matrix applies. This can also be None.
- dimension (read only)
- Matrix dimension, i.e. the number of classes.
Methods
- CostMatrix(dimension[, default cost])
- Constructs a matrix of the given size and initializes it with the default cost (1, if not given). All elements of the matrix are assigned this cost, except for the diagonal elements, which are set to 0. (Diagonal elements represent correct classifications, which usually carry no cost; you can, however, change this.)
import orange
cm = orange.CostMatrix(3)
print "classVar =", cm.classVar
# Rows are predicted classes, columns are correct classes.
for pred in range(3):
    for corr in range(3):
        print cm.getcost(pred, corr),
    print
This initializes the matrix and prints it out:
0.0 1.0 1.0
1.0 0.0 1.0
1.0 1.0 0.0
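The initialization rule can be mimicked without Orange. The following is a minimal pure-Python sketch (written for Python 3), not Orange's implementation; the helper name make_cost_matrix is hypothetical. It fills every cell with the default cost and puts 0 on the diagonal.

```python
def make_cost_matrix(dimension, default_cost=1.0):
    """Return a dimension x dimension nested list of costs.

    Hypothetical helper mirroring the rule described above; not part of Orange.
    Off-diagonal cells get default_cost, diagonal cells get 0.0.
    """
    return [[0.0 if pred == corr else float(default_cost)
             for corr in range(dimension)]
            for pred in range(dimension)]


matrix = make_cost_matrix(3)
for row in matrix:
    print(" ".join(str(cost) for cost in row))
```

Run as is, the sketch prints the same three rows as the Orange script.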
- CostMatrix(class descriptor[, default cost])
- Similar to the above, except that classVar is also set to the given descriptor. The number of values of the given attribute (which must be discrete) is used as the dimension.
data = orange.ExampleTable("iris")
cm = orange.CostMatrix(data.domain.classVar, 2)
This constructs a matrix similar to the one above (the class attribute in the iris domain is three-valued), except that the matrix contains 2s instead of 1s.
- CostMatrix([attribute descriptor, ]matrix)
- Initializes the matrix with the elements given as a sequence of sequences (you can mix lists and tuples if you find it funny). Each subsequence represents a row.
cm = orange.CostMatrix(data.domain.classVar, [(0, 2, 1), (2, 0, 1), (2, 2, 0)])
If you print this matrix out, it will look like this:
0.0 2.0 1.0
2.0 0.0 1.0
2.0 2.0 0.0
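To illustrate the "sequence of sequences" form, here is a small pure-Python sketch (Python 3) that turns mixed lists and tuples into rows of floats. The helper rows_to_matrix is hypothetical, and the check for a square shape is an assumption of this sketch; the text above does not say how Orange handles malformed input.

```python
def rows_to_matrix(rows):
    """Convert a sequence of sequences into a square list of float rows.

    Hypothetical helper; the squareness check is an assumption of this sketch.
    """
    matrix = [[float(cost) for cost in row] for row in rows]
    size = len(matrix)
    if any(len(row) != size for row in matrix):
        raise ValueError("cost matrix must be square")
    return matrix


# Lists and tuples can be mixed; each inner sequence is one row.
matrix = rows_to_matrix([(0, 2, 1), [2, 0, 1], (2, 2, 0)])
for row in matrix:
    print(" ".join(str(cost) for cost in row))
```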
- setcost(predicted value, correct value, cost)
- Sets the misclassification cost for the given pair of predicted and correct values.
The matrix above could also be constructed by first initializing it with 2s and then changing the costs of misclassifying Iris-virginica examples to 1.
cm = orange.CostMatrix(data.domain.classVar, 2)
cm.setcost("Iris-setosa", "Iris-virginica", 1)
cm.setcost("Iris-versicolor", "Iris-virginica", 1)
- getcost(predicted value, correct value)
- Returns the cost of prediction. Values must be integer indices; if classVar is set, you can also use symbolic values (strings).
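The interplay of setcost, getcost and symbolic values can be sketched in plain Python 3. The class SimpleCostMatrix below is a hypothetical stand-in, not Orange's CostMatrix; it assumes the class values are listed in index order (setosa, versicolor, virginica) and resolves strings to indices through that list.

```python
class SimpleCostMatrix:
    """Hypothetical stand-in for a cost matrix; not Orange's CostMatrix."""

    def __init__(self, values, default_cost=1.0):
        self.values = list(values)  # symbolic class values, in index order
        n = len(self.values)
        self.costs = [[0.0 if p == c else float(default_cost)
                       for c in range(n)]
                      for p in range(n)]

    def _index(self, value):
        # Accept either an integer index or a symbolic value (string).
        if isinstance(value, int):
            return value
        return self.values.index(value)

    def setcost(self, predicted, correct, cost):
        self.costs[self._index(predicted)][self._index(correct)] = float(cost)

    def getcost(self, predicted, correct):
        return self.costs[self._index(predicted)][self._index(correct)]


cm = SimpleCostMatrix(["Iris-setosa", "Iris-versicolor", "Iris-virginica"], 2)
cm.setcost("Iris-setosa", "Iris-virginica", 1)
cm.setcost("Iris-versicolor", "Iris-virginica", 1)
print(cm.getcost(0, 2), cm.getcost("Iris-virginica", "Iris-setosa"))
```

Indices and strings address the same cells, so cm.getcost(0, 2) and cm.getcost("Iris-setosa", "Iris-virginica") return the same value.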
Note that the size of the matrix is fixed at construction and cannot be changed afterwards.
For the final example, we shall compute the profits of knowing attribute values in the lenses dataset, using the same cost matrix as printed above.
data = orange.ExampleTable("lenses")
meas = orange.MeasureAttribute_cost()
meas.cost = ((0, 2, 1), (2, 0, 1), (2, 2, 0))
for attr in data.domain.attributes:
    print "%s: %5.3f" % (attr.name, meas(attr, data))
As the script shows, you don't have to (and usually won't) call the constructor explicitly. Instead, you set the corresponding field (in our case meas.cost) to a matrix and let Orange convert it to CostMatrix automatically.
Odd as it may look, since Orange uses the constructor to perform this conversion, even the following statement is valid (although the resulting cost matrix is rather dull, with 0s on the diagonal and 1s elsewhere):
meas.cost = data.domain.classVar
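The idea of converting whatever is assigned to a field by passing it through a constructor can be sketched with a Python property. This is only an illustration in Python 3 of the general pattern, not how Orange implements it; TinyCostMatrix and TinyMeasure are hypothetical names, and an integer dimension stands in for the class descriptor used above.

```python
class TinyCostMatrix:
    """Hypothetical stand-in: accepts a dimension (int) or a sequence of rows."""

    def __init__(self, arg, default_cost=1.0):
        if isinstance(arg, int):
            # Dimension only: default cost everywhere, 0 on the diagonal.
            self.rows = [[0.0 if p == c else float(default_cost)
                          for c in range(arg)]
                         for p in range(arg)]
        else:
            self.rows = [[float(cost) for cost in row] for row in arg]


class TinyMeasure:
    """Hypothetical object whose cost field converts whatever is assigned."""

    def __init__(self):
        self._cost = None

    @property
    def cost(self):
        return self._cost

    @cost.setter
    def cost(self, value):
        # Pass anything that is not already a matrix through the constructor.
        if not isinstance(value, TinyCostMatrix):
            value = TinyCostMatrix(value)
        self._cost = value


meas = TinyMeasure()
meas.cost = ((0, 2, 1), (2, 0, 1), (2, 2, 0))  # sequence of rows
explicit = meas.cost.rows
meas.cost = 3                                   # dimension only: default costs
default = meas.cost.rows
```

Because the setter accepts any constructor argument, assigning a bare dimension yields the default matrix, which parallels the assignment of a class descriptor shown above.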