Basket format

Basket files (.basket) are suitable for representing sparse data. Each example is represented by one line in the file, written as a comma-separated list of name-value pairs. Here's an example of such a file.

nobody, expects, the, Spanish, Inquisition=5
our, chief, weapon, is, surprise=3, surprise=2, and, fear,fear, and, surprise
our, two, weapons, are, fear, and, surprise, and, ruthless, efficiency
to, the, Pope, and, nice, red, uniforms, oh damn

The file contains four examples. The first example has five attributes defined: "nobody", "expects", "the", "Spanish" and "Inquisition". The first four have the default value of 1.0 and the last has a value of 5.0.

Unlike with other formats supported by Orange, the attributes that appear in the domain aren't defined in a header or in a separate file.

If an attribute appears more than once, its values are added. For instance, the value of attribute "surprise" in the second example is 6.0 and the value of "fear" is 2.0; the former appears three times, with values of 3.0, 2.0 and 1.0, and the latter appears twice, each time with a value of 1.0.
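The rules above (a default value of 1.0 and summing of repeated attributes) can be illustrated with a short sketch in plain Python. This is not Orange's own loader: the function name parse_basket_line is hypothetical, and the handling of surrounding whitespace and empty items is an assumption made for illustration only.

```python
from collections import defaultdict


def parse_basket_line(line):
    """Parse one basket line into a dict mapping attribute name to value.

    Hypothetical helper for illustration; not part of Orange.
    """
    values = defaultdict(float)
    for item in line.split(","):
        item = item.strip()
        if not item:
            continue  # assumption: empty items (e.g. a trailing comma) are skipped
        name, sep, raw = item.partition("=")
        # An item written without "=value" gets the default value of 1.0;
        # repeated attributes have their values added together.
        values[name.strip()] += float(raw) if sep else 1.0
    return dict(values)


line = "our, chief, weapon, is, surprise=3, surprise=2, and, fear,fear, and, surprise"
example = parse_basket_line(line)
print(example["surprise"], example["fear"])  # prints: 6.0 2.0
```

Only the attributes that actually occur in a line end up in the resulting dictionary, which mirrors the sparse nature of the format.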

All attributes are loaded as optional meta-attributes, so zero values don't take any memory (unless an attribute is given explicitly with a value of zero). See also the section on meta-attributes in the reference for domain descriptors.

Note that, at the time of writing this reference, only association rules can directly use examples presented in the basket format.