CVR-Lib last update 20 Sep 2009

How to use the Classifiers.

Types of Classifiers

All classifiers follow the same principle. First, they are trained using some kind of training data. The trained classifier can then be used to classify new data, i.e. to assign to the new data the label of the most similar training data.
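The train-then-classify principle can be illustrated with a minimal nearest-neighbor sketch. The class nearestNeighborSketch and its methods are hypothetical names chosen for this illustration and are not part of the CVR-Lib API; the sketch also assumes squared Euclidean distance as the similarity measure.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical toy classifier, NOT part of the CVR-Lib API.
// It stores the training patterns and labels a new pattern with the
// label of the closest stored pattern (squared Euclidean distance).
class nearestNeighborSketch {
public:
  // "Training" here simply memorizes the patterns and their labels.
  void train(const std::vector<std::vector<double> >& patterns,
             const std::vector<int>& labels) {
    assert(patterns.size() == labels.size());
    patterns_ = patterns;
    labels_ = labels;
  }

  // Returns the label of the most similar training pattern,
  // or -1 if the classifier has not been trained yet.
  int classify(const std::vector<double>& x) const {
    int best = -1;
    double bestDist = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < patterns_.size(); ++i) {
      assert(patterns_[i].size() == x.size());
      double d = 0.0;
      for (std::size_t k = 0; k < x.size(); ++k) {
        const double diff = patterns_[i][k] - x[k];
        d += diff * diff;
      }
      if (d < bestDist) {
        bestDist = d;
        best = labels_[i];
      }
    }
    return best;
  }

private:
  std::vector<std::vector<double> > patterns_;
  std::vector<int> labels_;
};
```

Real classifiers usually build a more compact model during training, but the calling pattern stays the same: train once, then classify as often as needed.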

The CVR-Lib distinguishes three categories of classifiers: unsupervised classifiers, supervised sequence classifiers and supervised instance classifiers.

Unsupervised classifiers are trained without knowing the proper labels or results for each training pattern. Clustering algorithms belong to this category, although most of them are not designed to classify new data afterwards. Supervised classifiers are trained knowing the expected result for each training pattern. They are divided into two groups: sequence classifiers work on time series, whereas instance classifiers work on n-dimensional vectors.

All classifiers have four essential properties:

The results of the classification are returned in an outputVector. This object contains labels and the corresponding recognition values, which can often be interpreted as probabilities. The section Output Vector gives details about this data structure.

Helper Classes

Progress Objects

The progressInfo object reports how many steps an algorithm will take until it finishes and how many have already been completed. It usually also gives the name of the classifier. The following progress infos exist:

Parameters

The parameters of the classifier class define an enumeration eDistanceMeasure, which specifies the distance the classifier uses. The options are the L1 and L2 distances.
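For reference, the two distances can be sketched as follows. The enumeration eDistanceSketch and the function distanceSketch are hypothetical stand-ins for this illustration; the actual enumerator names in cvr::classifier::parameters are not given here and may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for eDistanceMeasure; the enumerator names are
// assumptions and need not match cvr::classifier::parameters.
enum eDistanceSketch { SketchL1, SketchL2 };

// Distance between two vectors of equal dimension.
//   L1: sum of absolute coordinate differences
//   L2: square root of the sum of squared coordinate differences
double distanceSketch(const std::vector<double>& a,
                      const std::vector<double>& b,
                      eDistanceSketch measure) {
  assert(a.size() == b.size());
  double acc = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const double diff = a[i] - b[i];
    acc += (measure == SketchL1) ? std::fabs(diff) : diff * diff;
  }
  return (measure == SketchL1) ? acc : std::sqrt(acc);
}
```

For the points (0,0) and (3,4), the L1 distance is 7 and the L2 distance is 5.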

See also cvr::classifier::parameters.

Output Vector

The output vector is the result of a classification, i.e. of calling the classify method of a classifier. It assigns values to the labels. In the case of supervised classifiers, these labels were supplied by the user during training. In the case of unsupervised classification, the classifier usually assigns labels from 0 to C-1, where C is the number of classes found.
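The idea of values assigned to labels can be sketched with a small stand-alone structure. The names outputSketch, winnerLabel and makeProbabilities are hypothetical and do not reflect the interface of cvr::classifier::outputVector; the sketch further assumes that values are non-negative and that labels are non-negative integers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an output vector, NOT the CVR-Lib class:
// values[i] is the recognition value assigned to labels[i].
struct outputSketch {
  std::vector<int> labels;
  std::vector<double> values;
};

// Returns the label with the highest value, or -1 for an empty output
// (assuming that real labels are non-negative).
int winnerLabel(const outputSketch& out) {
  assert(out.labels.size() == out.values.size());
  if (out.values.empty()) return -1;
  std::size_t best = 0;
  for (std::size_t i = 1; i < out.values.size(); ++i) {
    if (out.values[i] > out.values[best]) best = i;
  }
  return out.labels[best];
}

// Scales the values so that they sum to one, which allows them to be
// read as probabilities. Returns false if the sum is not positive.
bool makeProbabilities(outputSketch& out) {
  double sum = 0.0;
  for (std::size_t i = 0; i < out.values.size(); ++i) sum += out.values[i];
  if (sum <= 0.0) return false;
  for (std::size_t i = 0; i < out.values.size(); ++i) out.values[i] /= sum;
  return true;
}
```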

Output vectors can be the final result of a process. In this case they are usually displayed by an application or used for statistical analysis of the classification process. For the latter, the classificationStatistics functor can be used. It is also possible to combine the results of several classifiers using the combination functor.
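One simple way to combine the results of several classifiers is to average the values per label. The sketch below shows only this one strategy under stated assumptions; the type labeledValues and the function averageOutputs are hypothetical, and the combination functor of the library may work differently or offer other strategies.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical representation of one classifier result:
// label -> recognition value. NOT the CVR-Lib outputVector.
typedef std::map<int, double> labeledValues;

// Averages the values per label over all given results. A label that
// is missing from one result counts as value 0 for that result.
labeledValues averageOutputs(const std::vector<labeledValues>& outputs) {
  labeledValues result;
  if (outputs.empty()) return result;
  for (std::size_t i = 0; i < outputs.size(); ++i) {
    for (labeledValues::const_iterator it = outputs[i].begin();
         it != outputs[i].end(); ++it) {
      result[it->first] += it->second;  // missing keys start at 0.0
    }
  }
  const double n = static_cast<double>(outputs.size());
  for (labeledValues::iterator it = result.begin();
       it != result.end(); ++it) {
    it->second /= n;
  }
  return result;
}
```

Averaging treats all classifiers as equally reliable; a weighted average would be a natural variation when that assumption does not hold.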

For further reading see the documentation of cvr::classifier::outputVector.

Related Topics

Visualization of Classifiers and Data

Visualizing Data

cvr::draw, cvr::epsDraw, cvr::draw3D, cvr::draw2DDistribution

Visualizing Classification Results

cvr::classifier2DVisualizer

Classification Statistics

cvr::classificationStatistics

Sammon's Mapping

Sammon's mapping transforms points in n-dimensional space to points in m-dimensional space while trying to preserve all distances between the points. Usually, m will be 2 or 3 so that the points can be displayed using one of the cvr::draw classes and an appropriate cvr::viewer. Sammon's mapping can be very useful to get an idea of the distribution of higher-dimensional data without losing as much information as when using e.g. cvr::principalComponents to reduce the dimensionality.
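How well a mapping preserves the distances is commonly measured with Sammon's stress, which compares every pairwise distance in the original space with the corresponding distance in the mapped space. The function below is a minimal sketch of that standard formula; the names pointSet, euclidSketch and sammonStress are hypothetical, and it is an assumption that the error reported by cvr::sammonsMapping is computed in exactly this way.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<double> > pointSet;

// Euclidean distance between two points of equal dimension.
static double euclidSketch(const std::vector<double>& a,
                           const std::vector<double>& b) {
  assert(a.size() == b.size());
  double acc = 0.0;
  for (std::size_t k = 0; k < a.size(); ++k) {
    const double diff = a[k] - b[k];
    acc += diff * diff;
  }
  return std::sqrt(acc);
}

// Sammon's stress
//   E = (1 / sum_{i<j} D_ij) * sum_{i<j} (D_ij - d_ij)^2 / D_ij
// with D_ij the distance in the original space and d_ij the distance
// in the mapped space. E is 0 if all distances are preserved.
double sammonStress(const pointSet& original, const pointSet& mapped) {
  assert(original.size() == mapped.size());
  double sumD = 0.0;
  double sumErr = 0.0;
  for (std::size_t i = 0; i < original.size(); ++i) {
    for (std::size_t j = i + 1; j < original.size(); ++j) {
      const double D = euclidSketch(original[i], original[j]);
      if (D <= 0.0) continue;  // skip coincident points (division by zero)
      const double d = euclidSketch(mapped[i], mapped[j]);
      sumD += D;
      sumErr += (D - d) * (D - d) / D;
    }
  }
  return (sumD > 0.0) ? sumErr / sumD : 0.0;
}
```

A stress close to 0 indicates that the low-dimensional picture reflects the original distances well, whereas a mapping that collapses all points onto one location yields a stress of 1.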

However, the mapping is a very difficult task and might easily fail to converge to a minimum. Check the error to get an idea of the quality of the result. If the error is high, there are several options:

For more information see cvr::sammonsMapping.


Generated on Sun Sep 20 22:08:32 2009 for CVR-Lib by Doxygen 1.5.8