Examples are usually stored in a table called orange.ExampleTable. In Python you will perceive it as a list, and this is what it basically is: an ordered sequence of examples, supporting the usual Python procedures for lists, including the more advanced operations such as slicing and sorting.
If data is an instance of ExampleTable, data[0] is not a reference to the first element but the first element itself. About the only case in which this matters is when we try to swap two elements, either with data[0], data[1] = data[1], data[0] or with an intermediate variable: it won't work. For the same reason, random.shuffle doesn't work on ExampleTable (as it doesn't on numpy, by the way). Use ExampleTable's own shuffle method instead.
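Why the swap fails can be pictured with a toy model. The sketch below is not Orange code: SlotTable and SlotView are hypothetical stand-ins, written under the assumption that indexing yields a live handle on a table slot and that assignment copies values into the slot.

```python
class SlotView:
    """Live handle on one slot of a SlotTable (hypothetical stand-in for an example)."""

    def __init__(self, table, index):
        self._table = table
        self._index = index

    def values(self):
        # Always reads the current content of the slot.
        return list(self._table._rows[self._index])


class SlotTable:
    """Toy table that stores rows by value (hypothetical, not Orange's implementation)."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]

    def __getitem__(self, index):
        return SlotView(self, index)

    def __setitem__(self, index, item):
        # Assignment copies the values into the slot instead of rebinding a reference.
        self._rows[index][:] = item.values()

    def as_lists(self):
        return [list(r) for r in self._rows]


def swap_attempt(rows):
    table = SlotTable(rows)
    # Slot 0 is overwritten first; the handle on slot 0 then reads the new content.
    table[0], table[1] = table[1], table[0]
    return table.as_lists()


result = swap_attempt([[1, 1], [2, 2]])  # both rows end up as [2, 2]
```

In this model the original content of the first slot is lost as soon as the first assignment runs, which is also why an intermediate variable holding a handle does not help.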
ExampleTable is derived from a more general abstract class ExampleGenerator.
Attributes
Tells whether ExampleTable contains copies of examples (true) or only references to examples owned by another table (stored in field lock). Example tables with references to examples are useful for sampling examples from ExampleTable without copying examples.
Updated whenever ExampleTable is changed. This is not foolproof, since ExampleTable cannot detect when individual examples are changed. It will, however, catch any additions and removals from the table.
Random generator used by method randomexample. If the method is called and randomGenerator is None, a new generator is constructed with random seed 0 and stored here for subsequent use. If you would like to have different random examples each time your script is run, use a random number from Python as the random seed.
ExampleTable can be constructed by reading from a file, packing existing examples or creating an empty table. To save the data, see the documentation on file formats.
Constructs an empty ExampleTable for the given domain.
For exercise, we shall construct a domain for the common version of Monk datasets; attribute names will be a, b, c, d, e, and f, and their values will be 1, 2, 3, and 4. Attribute e is four-valued, a, b and d are three-valued, and c and f are binary.
part of exampletable1.py
Attributes are defined in a list comprehension where i goes from 0 to 5 (for six attributes), the attribute name is chr(97+i), which gives letters from a to f, and the attribute's values are a slice from the list values, with exactly as many values as specified in card for each particular attribute. If you don't understand this, don't worry; just pretend that all attributes are defined as simply as the class attribute.
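The comprehension described above can be tried out without Orange. This is only a sketch: the contents of card are an assumption that follows the standard description of the Monk problems, and plain (name, values) pairs stand in for Orange's attribute descriptors.

```python
# Assumed cardinalities, following the standard Monk description:
# a, b and d have three values, c and f two, and e four.
card = [3, 3, 2, 3, 4, 2]
values = ["1", "2", "3", "4"]

# chr(97 + i) yields the letters a to f; values[:card[i]] keeps
# only as many values as the attribute's cardinality allows.
attribute_specs = [(chr(97 + i), values[:card[i]]) for i in range(6)]

for name, attribute_values in attribute_specs:
    print(name, attribute_values)
```

In the real script each pair would presumably be turned into an attribute descriptor; the sketch stops at the names and value lists because those are all the paragraph above describes.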
Examples can be given either as an ExampleGenerator, such as an ExampleTable, or as an ordinary Python list containing examples (as objects of type Example).
If the optional second argument is true, the new ExampleTable will only store references to examples. In this case, the first argument must be an ExampleTable, not a list.
Examples can be given as an ExampleGenerator, a Python list containing examples as objects of type Example or Python lists, or a Numeric array, if your Orange build supports it.
If you have examples stored in a list of lists, for instance
you can convert it into an ExampleTable by
Instead of strings (i.e., symbolic values), you can use value indices in loe when you find it more appropriate:
The other way of putting such examples into an ExampleTable is the method extend.
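The correspondence between symbolic values and value indices can be sketched in plain Python. Everything here is illustrative: value_lists, the sample row, to_indices and to_symbols are hypothetical names, and the only assumption is that a value index is the position of a symbolic value in its attribute's list of values.

```python
# Hypothetical lists of values, one per attribute, in declaration order.
value_lists = [
    ["1", "2", "3"],  # first attribute
    ["1", "2"],       # second attribute
    ["0", "1"],       # class attribute
]


def to_indices(row, value_lists):
    """Replace each symbolic value with its position in the attribute's value list."""
    return [vals.index(v) for v, vals in zip(row, value_lists)]


def to_symbols(row, value_lists):
    """Inverse of to_indices: look each index up in the attribute's value list."""
    return [vals[i] for i, vals in zip(row, value_lists)]


symbolic_row = ["3", "1", "1"]
index_row = to_indices(symbolic_row, value_lists)  # [2, 0, 1]
```

Note that the string "1" maps to index 0 for the first two attributes but to index 1 for the class, which is why strings and indices must not be confused when mixing both forms in one list.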
Finally, here's an example that puts the contents of a Numeric array into an ExampleTable.
For this example, we first constructed a domain with attributes a1, a2, a3, a4 and a5. We then put together a simple Numeric array with five columns and put it into a table.
exampletable_merge.py (uses merge1.tab, merge2.tab)
First, note the use = data1.domain, which ensures that, while loading the second table, the attributes from the first will be reused if they have the same name and type. Without that, the attribute a1 from the first table and the attribute a1 from the second table would be two different attributes, and the merged table would have two attributes named a1 instead of a single one, which is what we want. The same goes for the meta-attribute m1, which will also have the same id in both tables. (For this reason, it is important to pass the entire domain, i.e. data1.domain, not a list of attributes such as data1.domain.variables or, obviously doing it wrong on purpose, data1.domain.variables + data1.domain.getmetas().values().)
Merging succeeds since the values of a1 and m1 are the same for all matching examples from data1 and data2, and the printout is as anticipated.
ExampleTable supports most of the standard Python operations on lists. All the basic operations (getting, setting and removing examples and slices) are supported.
When you retrieve items (examples) from the table, you get references to examples, not copies. So, when you write ex = data[0] and then modify ex, you will actually change the first example in data. If the table contains references to examples, it can only contain references to examples in a single table, so when you assign items, e.g. by data[10]=example, the example must come from the right table.
When setting items, you can present examples as objects of type Example or as ordinary lists, for instance, data[0] = ["1", "1", 1, "1", "1", "1", "1"]. This form can, of course, only be used by an ExampleTable that owns its examples.
data[:10], for instance, gives the first ten examples from data. These examples are not returned in an ExampleTable but in an ordinary Python list containing references to examples in the table. For instance, to do something with the first n examples, you can use a loop like this.
As for ordinary lists, this is somewhat slower than
But you probably won't notice the difference except in really large tables.
If the table contains references to examples, similar restrictions as for assigning items apply.
if statement.
part of exampletable1.py
For each example, we prepare a list of six random values, each ranging from 0 to the cardinality of the attribute minus one (randint(0, c-1) returns a random value between 0 and c-1, inclusive). To this we append a class value, computed according to Monk 1's concept. The constructed list is appended to the table.
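The generation step can be sketched without Orange. Two things are assumptions here, taken from the standard description of the Monk problems rather than from the script itself: the cardinalities in card, and the Monk 1 concept (a equals b, or e has its first value). Rows are collected in a plain list instead of an ExampleTable, and random_monk1_row is a hypothetical helper name.

```python
import random

# Assumed cardinalities of attributes a to f (standard Monk description).
card = [3, 3, 2, 3, 4, 2]


def random_monk1_row(rng):
    """Build one row of six random value indices plus a Monk 1 class index."""
    row = [rng.randint(0, c - 1) for c in card]
    # Assumed concept: a == b or e == "1"; with value indices, "1" is index 0.
    row.append(int(row[0] == row[1] or row[4] == 0))
    return row


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rows = [random_monk1_row(rng) for _ in range(100)]
```

Each row has seven entries: six attribute value indices followed by the class index, which is the list form that can be appended to a table owning its examples.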
Restrictions apply for tables that contain references to examples.
append for each example in the list.
Restrictions apply for tables that contain references to examples.
Converts the ExampleTable into an ordinary Python list. If nativity is 2 (default), the list contains objects of type Example (references to examples in the table). If nativity is 1, even examples are replaced by lists containing objects of type Value (therefore, the ExampleTable is translated to a list of lists of Value). If nativity is 0, even values are represented as native Python objects: strings and numbers.
ExampleTable offers several methods for selection and translation of examples (some of them are actually inherited from a more general class ExampleGenerator). For easier illustration, we shall prepare an example table with 10 examples, described by a single numerical attribute having values from 0 to 9 (effectively enumerating the examples).
part of exampletable2.py
(inherited from ExampleGenerator) select returns a subset of examples. The argument is a list of integers of the same length as the example table. select picks the examples for which the corresponding list element is equal to the second (optional) argument. If the latter is omitted, examples for which the corresponding element is non-zero are selected. An additional keyword argument negate=1 reverses the selection.
Note: select used to have many other functions, which are now deprecated and only kept for compatibility. We shall not document them, except for one that may cause unexpected behaviour. Say we have a data set which does not contain three examples (it can have more or fewer). Calling select([0, 1, 5]) will return a table containing only the first, second and sixth attributes. In other words, if you use select as described above (and below) but give it a list of the wrong size, the call will be interpreted as a request to change the domain. Don't purposely call select to change the domain.
The most natural use of this method is for dividing examples into folds. For this, we first prepare a list of fold indices using an appropriate descendant of MakeRandomIndices; MakeRandomIndicesCV, for instance, will prepare indices for cross-validation (see documentation on random indices). Then we feed the indices to select, as shown in the example below.
part of exampletable2.py
The printout begins with.
For the first fold (0), the positions of zeros determine the examples that are selected for testing: these are the examples at positions 1, 4 and 6 (don't forget that indices in Python start with zero).
Another form of calling select is by giving a list of integers that are interpreted as boolean values.
part of exampletable2.py
This form can also be given negate as a keyword argument to reverse the selection.
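The selection rules described in this section can be mimicked on an ordinary list. The function below is a hypothetical re-implementation for illustration only, not Orange's select: it works on plain Python lists, and the fold indices are made up so that the zeros fall at positions 1, 4 and 6.

```python
def select(examples, indices, value=None, negate=False):
    """Pick examples whose index entry equals value, or is non-zero if value is omitted."""
    if len(indices) != len(examples):
        # Orange would reinterpret such a call; this sketch simply refuses it.
        raise ValueError("indices must have one entry per example")
    if value is None:
        keep = [bool(i) for i in indices]
    else:
        keep = [i == value for i in indices]
    if negate:
        keep = [not k for k in keep]
    return [ex for ex, k in zip(examples, keep) if k]


data = list(range(10))                   # ten examples, numbered 0 to 9
folds = [2, 0, 1, 2, 0, 1, 0, 2, 1, 2]   # hypothetical fold indices

test_set = select(data, folds, 0)                # examples in fold 0
train_set = select(data, folds, 0, negate=True)  # everything else
picked = select(data, [1, 1, 0, 0, 0, 0, 0, 0, 0, 1])  # boolean form
```

With these indices, test_set holds the examples at positions 1, 4 and 6, train_set holds the remaining seven, and the boolean form picks the first, second and last example.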
For compatibility reasons, the select method still has some additional functionality, which has been moved to methods filter and translate.
ExampleTables, naturally) if they called selectref instead of select.
selectref and then native.
(inherited from ExampleGenerator) indices gives a list of indices of examples to be selected. Selected examples are returned in an example table. For instance, calling data.getitems([0, 1, 9]) gives the same result as the above data.select([1, 1, 0, 0, 0, 0, 0, 0, 0, 1]): a (new) ExampleTable with examples data[0], data[1] and data[9]. Calling data.getitems(range(10)) has a similar effect to data[:10], except that the former returns an example table and the latter returns an ordinary list.
Same as getitems, except that the resulting table contains references to examples instead of copies.
condition. These can be given in the form of keyword arguments or a dictionary; with the latter, an additional keyword argument negate can be given for selection reversal. The result is a new ExampleTable.
For instance, young patients in the lenses dataset can be selected by
More than one value can be allowed and more than one attribute checked. To select all patients with age "young" or "psby" who are astigmatic, use
If you need the reverse selection, you cannot simply add negate=1 as in the select method, since this would be interpreted simply as another attribute (negate) whose value needs to be 1 (e.g. values[1], see documentation on Variable). For negation, you should use a somewhat less readable way of passing arguments to filter: pack them into a dictionary. For instance, to select examples that are not young and astigmatic, use
Note that this selects patients that are young but not astigmatic, those that are astigmatic but not young, and those that are neither. In essence, the conjunction of conditions is computed first and the result is negated if negate is 1. If you need more flexible selection (e.g. disjunction instead of conjunction), see documentation on preprocessors.
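The "conjunction first, then negation" rule is easy to check on plain dictionaries. This is a hypothetical model of the semantics described above, not Orange's filter: the function names, the four sample patients and the attribute values "yes" and "no" are made up for the illustration.

```python
def matches(row, conditions):
    """True if the row satisfies every condition (a conjunction)."""
    for attr, allowed in conditions.items():
        if isinstance(allowed, str):
            allowed = [allowed]  # a single value is shorthand for a one-item list
        if row[attr] not in allowed:
            return False
    return True


def filter_rows(rows, conditions, negate=False):
    """Keep rows matching all conditions, or the complement when negate is set."""
    return [row for row in rows if matches(row, conditions) != bool(negate)]


patients = [
    {"age": "young", "astigmatic": "yes"},
    {"age": "young", "astigmatic": "no"},
    {"age": "psby", "astigmatic": "yes"},
    {"age": "psby", "astigmatic": "no"},
]

young_astigmatic = {"age": "young", "astigmatic": "yes"}
selected = filter_rows(patients, young_astigmatic)
rejected = filter_rows(patients, young_astigmatic, negate=True)
either_age = filter_rows(patients, {"age": ["young", "psby"], "astigmatic": "yes"})
```

Here selected contains only the first patient, while rejected contains the other three, including the one who is neither young nor astigmatic.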
Continuous attribute values are specified by pairs of values. In dataset "bridges", bridges with lengths between 1000 and 2000 (inclusive) are selected by
Bridges that are shorter or longer than that are selected by inverting the range.
filt of type orange.Filter.
filter also has variants that return tables and lists of references to examples, analogous to methods selectref and selectlist.
If keepMetas is 1, the new domain will also include all meta attributes from the original domain.
To select random examples, ExampleTable uses a random number generator stored in the field randomGenerator. If it has none, a new one is constructed and initialized with random seed 0. As a consequence, such a script will always select the same examples. If you don't want this, create another random generator and use a random number from Python to initialize it.
Since Orange calls constructors when an object of incorrect type is assigned to a built-in attribute, this can be written in a shorter form as
If weightID is given, a meta-value is added to each example to contain the sum of weights of all examples merged into a particular example.
values table (see documentation on Variable).
Examples in dataset "bridges" can be sorted by lengths and years they were erected by data.sort("LENGTH", "ERECTED").
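For readers more used to plain Python, sorting by two attributes corresponds to sorting with a tuple key. The sketch assumes that the first attribute named is the primary key and the second breaks ties; the three rows are invented and are not taken from the bridges dataset.

```python
# Invented rows; only the two attributes used for sorting are shown.
bridges = [
    {"LENGTH": 1200, "ERECTED": 1890},
    {"LENGTH": 1000, "ERECTED": 1905},
    {"LENGTH": 1200, "ERECTED": 1850},
]

# Tuples compare element by element, so LENGTH decides first and
# ERECTED is consulted only when two lengths are equal.
by_length_then_year = sorted(bridges, key=lambda b: (b["LENGTH"], b["ERECTED"]))
ordering = [(b["LENGTH"], b["ERECTED"]) for b in by_length_then_year]
```

Unlike sorted, which returns a new list, the table's sort method is called on the table itself.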
random.shuffle, which does not work for ExampleTable.
None. This function is not available for tables that contain references to examples.
Adding a meta-value to all examples in a table is a very common operation, deserving specialized functions. There are two, one for adding and the other for removing a meta-value.
id can be an integer returned by orange.newmetaid(), or a string or an attribute description if the meta-attribute is registered in the table's domain. If value is given, it must be something convertible to a Value. If a corresponding meta-attribute is registered with domain, value can be symbolic. Otherwise, it must be an index (to values), a continuous number or an object of type Value. value is an optional argument; the default is 1.0 (to be useful as a neutral weight for examples).
id can be an integer, a string or an attribute description registered with the domain.