The AIAI CBR Shell is a generic tool for case-based reasoning. The tool performs classification based on case comparison. The parameters of the algorithm can be varied: the number of nearest neighbours considered can be specified, the weights can be set manually, or the weights can be optimised by a genetic algorithm. The accuracy of the algorithm is measured by a leave-one-out evaluation.
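The classification and evaluation loop described above can be sketched roughly as follows. This is an illustrative sketch, not the Shell's own code: the function names are hypothetical, the ratio-based similarity for numbers and the majority vote among the K nearest cases are assumptions, and the last two sample rows are taken from the standard Iris data set rather than from this page.

```python
from collections import Counter

def field_similarity(a, b):
    """Similarity in [0, 1]. Assumed: equality test for strings,
    smaller/larger ratio for non-negative numbers."""
    if isinstance(a, str) or isinstance(b, str):
        return 1.0 if a == b else 0.0
    hi = max(a, b)
    return 1.0 if hi == 0 else min(a, b) / hi

def case_similarity(x, y, weights):
    """Weighted average of the per-field similarities."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * field_similarity(a, b)
               for w, a, b in zip(weights, x, y)) / total

def classify(query, cases, weights, k):
    """Majority vote among the k cases most similar to the query."""
    ranked = sorted(cases,
                    key=lambda c: case_similarity(query, c[0], weights),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(cases, weights, k):
    """Classify each case against all the others; return the fraction correct."""
    correct = 0
    for i, (features, label) in enumerate(cases):
        rest = cases[:i] + cases[i + 1:]
        if classify(features, rest, weights, k) == label:
            correct += 1
    return correct / len(cases)

cases = [
    ((5.1, 3.5, 1.4, 0.2), "Iris-setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Iris-setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Iris-versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "Iris-versicolor"),
]
accuracy = leave_one_out_accuracy(cases, weights=[1.0, 1.0, 1.0, 1.0], k=1)
print(accuracy)
```

Setting a weight to 0.0 removes that field from the comparison, which is what makes the weights worth tuning by hand or by the genetic algorithm.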
A case is simply a comma-delimited list of values (numbers or
strings), one of which denotes the class to which the case
belongs. For example, each case in the Iris data has 5 fields, the last being the class:
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
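As a rough illustration of this format, a single case line can be split into its feature values and its class. The helper names and the class_index parameter below are hypothetical, and converting every value that parses as a number to a float is an assumption made for the sketch.

```python
def to_value(text):
    """Return a float if the text parses as a number, else the string itself."""
    try:
        return float(text)
    except ValueError:
        return text

def parse_case(line, class_index=-1):
    """Split one comma-delimited case into (features, class label)."""
    fields = [f.strip() for f in line.split(",")]
    label = fields.pop(class_index)  # the class field, last by default
    return [to_value(f) for f in fields], label

features, label = parse_case("5.1,3.5,1.4,0.2,Iris-setosa")
print(features, label)
```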
The CBR Shell is made available without warranty or guarantee for academic purposes only.
Download and Installation
The CBR Shell is now available as an applet, so no installation is required. Please note that your browser must use Java 1.4 (the applet has been tested on Mozilla 1.2 on Linux, Explorer 6.0 on Windows XP, and Explorer 5.2 on Mac OS). If you can put your casebase files in a web-accessible area, simply use that URL in place of the default URL that appears in the 'Load URL' window. If you wish to run the CBR Shell as an application, download the jar file (cbrapplet.jar). To run it under Linux or Mac OS X, type java -jar cbrapplet.jar; under Windows XP, open the jar file with javaw or run it from a shell.
You may wish to copy the files from the table below (shift-click to download, or use the 'Save Page As' option in Netscape 6) and use the file names given.
Iris casebase: | iris-casebase |
Key file for Iris casebase: | iris.key |
Cancer casebase: | cancer-casebase |
Key file for Cancer casebase: | cancer.key |
Zoo casebase: | zoo-casebase |
Key file for Zoo casebase: | zoo.key |
Using the Shell
Load a casebase file (the key is automatically loaded, but
must be in the same directory).
Step presents the results of each case analysis - the
screen should appear as shown below after the casebase is loaded and Step
is executed:
Run goes through the casebase and calculates the accuracy.
CBRSettings: set K or the threshold, and select the weight structure.
Genetic Algorithm: the GA optimises the weight structure.
Each chromosome represents the weights 1..N for a casebase
whose cases each have N fields. The GA does not alter the value of K.
Initially, set the number of chromosomes to a small value, e.g. 20.
Run sets up and starts the GA. The results of each generation are printed
to the command line.
The mapping from bits to weights can be configured.
The default number of bits is 2, giving a total of 2**bits = 2**2 = 4 weight values.
The weight values must be listed in the Mapping text field.
The bit values 00-11 correspond to 4 weights:
00->weight-1, 01->weight-2, 10->weight-3, 11->weight-4
In the example above,
00->0.0, 01->1.0, 10->2.0, 11->4.0
meaning that 00 in the chromosome is decoded as a weight of 0.0, 01 is
decoded as 1.0 and so on. The CBR is run once, using these parameters,
to evaluate each chromosome. The usual crossover and mutation
operators are applied.
NOTE: there may be problems with this option under Windows.
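The bit-to-weight decoding described above can be sketched as follows. Representing the chromosome as a string of '0' and '1' characters is an assumption made for illustration, and decode_chromosome is a hypothetical name, not part of the Shell.

```python
def decode_chromosome(bits, mapping, bits_per_weight=2):
    """Decode a bit string into weights, one weight per group of bits.

    mapping lists the weight values in order, so with 2 bits
    mapping[0] is the weight for 00 and mapping[3] the weight for 11.
    """
    if len(mapping) != 2 ** bits_per_weight:
        raise ValueError("mapping must list 2**bits_per_weight weight values")
    if len(bits) % bits_per_weight != 0:
        raise ValueError("chromosome length must be a multiple of bits_per_weight")
    weights = []
    for i in range(0, len(bits), bits_per_weight):
        index = int(bits[i:i + bits_per_weight], 2)  # e.g. '10' -> 2
        weights.append(mapping[index])
    return weights

mapping = [0.0, 1.0, 2.0, 4.0]  # 00->0.0, 01->1.0, 10->2.0, 11->4.0
print(decode_chromosome("00011011", mapping))  # [0.0, 1.0, 2.0, 4.0]
```

For a casebase with N weighted fields, a chromosome in this representation would be N times bits_per_weight characters long.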
Casebase and Key Files
The Casebase Data: the data must be in comma-delimited form, with each case on its own line. The first line of the casebase must contain the name of the key file, and the second must state the goal field (i.e. the class to which the data/record/case belongs). See the example files.
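A minimal sketch of reading this layout is given below. The exact form of the goal-field line is not specified here, so the function returns it as an uninterpreted string, and the "5" in the sample text is only a placeholder; parse_casebase is a hypothetical name.

```python
def parse_casebase(text):
    """Split casebase text into (key file name, goal field line, cases).

    Line 1 names the key file, line 2 states the goal field, and every
    remaining non-empty line is one comma-delimited case.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("casebase needs a key file line and a goal field line")
    key_file, goal_field = lines[0], lines[1]
    cases = [[f.strip() for f in ln.split(",")] for ln in lines[2:]]
    return key_file, goal_field, cases

sample = """iris.key
5
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
"""
key_file, goal_field, cases = parse_casebase(sample)
print(key_file, goal_field, len(cases))
```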
The Key: this file defines the type of matching that is done on
each field in each case. The matching types include:
num - numerical comparison by evaluating the ratio of two numbers
stringexact - string comparison (equality test)
trigram - comparison of strings/sentences/paragraphs by trigram
matching
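The three matching types might be sketched as below. The formulas are assumptions, since only the general idea of each type is given here: num is taken as the smaller value divided by the larger (for non-negative numbers), and trigram is scored with the Dice coefficient over sets of lower-cased character trigrams. The function names are hypothetical.

```python
def num_match(a, b):
    """Ratio of the smaller to the larger number; assumes non-negative values."""
    hi = max(a, b)
    return 1.0 if hi == 0 else min(a, b) / hi

def stringexact_match(a, b):
    """Equality test: 1.0 if the strings are identical, else 0.0."""
    return 1.0 if a == b else 0.0

def trigrams(text):
    """Set of all 3-character substrings of the lower-cased text."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}

def trigram_match(a, b):
    """Dice coefficient of the two trigram sets (an assumed scoring rule)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        # Both strings are shorter than 3 characters: fall back to equality.
        return stringexact_match(a, b)
    return 2 * len(ta & tb) / (len(ta) + len(tb))

print(num_match(3.0, 3.5))
print(stringexact_match("Iris-setosa", "Iris-setosa"))
print(trigram_match("case", "casebase"))
```

Each function returns a value between 0.0 and 1.0, so the results can be combined across fields using the weights from the key file.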
The Iris key is simply:
num weight: 1.0
num weight: 1.0
num weight: 1.0
num weight: 1.0
stringexact weight: 1.0
meaning there are 4 numerical fields, and a string field. All weights
are 1.0. In fact, the final field is the goal field and is always
treated as a string (the weight is ignored).
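A key in this form can be read with a few lines of code. The sketch below assumes every line follows the "type weight: value" pattern shown for the Iris key; parse_key is a hypothetical name.

```python
def parse_key(text):
    """Parse key file text into a list of (matching type, weight) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match_type, sep, value = line.partition("weight:")
        if not sep:
            raise ValueError("expected 'weight:' in key line: " + line)
        entries.append((match_type.strip(), float(value)))
    return entries

iris_key = """num weight: 1.0
num weight: 1.0
num weight: 1.0
num weight: 1.0
stringexact weight: 1.0
"""
for match_type, weight in parse_key(iris_key):
    print(match_type, weight)
```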
Contact: Stuart Aitken
Email: s.aitken@ed.ac.uk