Soft independent modelling of class analogies

Soft independent modelling by class analogy (SIMCA) is a statistical method for supervised classification of data. The method requires a training data set consisting of samples (or objects), each described by a set of attributes and labelled with its class membership. The term "soft" refers to the fact that the classifier can identify a sample as belonging to more than one class, rather than necessarily assigning samples to non-overlapping classes.

Method

To build the classification models, the samples belonging to each class are analysed separately using principal component analysis (PCA); only the significant components are retained.

For a given class, the resulting model describes a line (for one principal component, or PC), a plane (for two PCs) or a hyper-plane (for more than two PCs). For each modelled class, the mean orthogonal distance of the training samples from the line, plane or hyper-plane, calculated as the residual standard deviation, is used to determine a critical distance for classification. This critical distance is based on the F-distribution and is usually calculated at the 95% or 99% confidence level.
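
The construction of one class model and its critical distance can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `fit_class_model`, the use of NumPy and SciPy, and the degrees of freedom used for the residual standard deviation and the F-test (p - A per object and (n - A - 1)(p - A) per class, a commonly quoted textbook choice) are assumptions that are not stated in the text above.

```python
import numpy as np
from scipy.stats import f as f_dist


def fit_class_model(X, n_components, alpha=0.05):
    """Fit a PCA model to the samples of ONE class (hypothetical helper).

    Returns the class mean, the loadings, the residual standard deviation
    s0 and a critical distance s_crit derived from the F-distribution.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal axes of the centred class data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T                # shape (p, A)
    # Part of each sample that is orthogonal to the line/plane/hyper-plane.
    residuals = Xc - Xc @ loadings @ loadings.T
    # ASSUMPTION: degrees of freedom follow a common textbook convention.
    df_obj = p - n_components
    df_class = (n - n_components - 1) * (p - n_components)
    s0 = np.sqrt((residuals ** 2).sum() / df_class)
    f_crit = f_dist.ppf(1.0 - alpha, df_obj, df_class)
    return {"mean": mean, "loadings": loadings,
            "s0": s0, "s_crit": s0 * np.sqrt(f_crit)}


# Illustrative synthetic data: 30 samples, 5 attributes, 2 retained PCs.
rng = np.random.default_rng(0)
X_class = rng.normal(size=(30, 5))
model = fit_class_model(X_class, n_components=2, alpha=0.05)
print(model["s0"], model["s_crit"])
```

Setting `alpha=0.01` would correspond to the 99% level mentioned above.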

New observations are projected into each PC model and their residual distances are calculated. An observation is assigned to a class when its residual distance from that class model is below the statistical limit for the class. An observation may be found to belong to several classes, and the number of observations assigned to more than one class gives a measure of the quality of the model. Classification efficiency is usually reported using receiver operating characteristic (ROC) curves.
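
A minimal sketch of this classification step is given below. The helper names (`fit`, `residual_distance`, `classify`), the synthetic data and the degrees of freedom used in the F-test are illustrative assumptions, not part of the method description above. The point of the sketch is that each class model is tested independently, so the result is a list that may contain no label, one label or several labels.

```python
import numpy as np
from scipy.stats import f as f_dist


def fit(X, A, alpha=0.05):
    """PCA model of one class: (mean, loadings, critical residual distance)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:A].T
    E = Xc - Xc @ P @ P.T
    # ASSUMPTION: textbook degrees of freedom, not given in the text.
    df_class = (n - A - 1) * (p - A)
    s0 = np.sqrt((E ** 2).sum() / df_class)
    s_crit = s0 * np.sqrt(f_dist.ppf(1.0 - alpha, p - A, df_class))
    return mean, P, s_crit


def residual_distance(x, mean, P):
    """Residual standard deviation of one observation from one class model."""
    xc = np.asarray(x, dtype=float) - mean
    e = xc - P @ (P.T @ xc)          # component orthogonal to the model
    return np.sqrt((e ** 2).sum() / (xc.size - P.shape[1]))


def classify(x, models):
    """Return every class label whose model accepts x (possibly none)."""
    return [label for label, (mean, P, s_crit) in models.items()
            if residual_distance(x, mean, P) < s_crit]


# Two synthetic classes lying close to two different lines in 3-D space.
rng = np.random.default_rng(0)
t = rng.normal(scale=2.0, size=(40, 1))
class_a = t * np.array([1.0, 0.0, 0.0]) + rng.normal(scale=0.05, size=(40, 3))
u = rng.normal(scale=2.0, size=(40, 1))
class_b = (np.array([0.0, 0.0, 10.0]) + u * np.array([0.0, 1.0, 0.0])
           + rng.normal(scale=0.05, size=(40, 3)))
models = {"A": fit(class_a, 1), "B": fit(class_b, 1)}

print(classify(class_a.mean(axis=0), models))   # centre of class A
print(classify([50.0, 50.0, 50.0], models))     # far from both lines
```

Because only the orthogonal distance is tested here, a point lying far along a class line but close to it would still be accepted by that class.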

In the original SIMCA method, the ends of the hyper-plane of each class are closed off by setting statistical control limits along the axes of the retained principal components (i.e. the range from the minimum score value minus 0.5 times the score standard deviation to the maximum score value plus 0.5 times the score standard deviation).
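
The score range described above can be written down directly. In the sketch below, the function names and the use of the sample standard deviation (`ddof=1`) are assumptions; the text does not say which estimator of the standard deviation is meant.

```python
import numpy as np


def score_limits(train_scores, k=0.5):
    """Lower and upper score limits for each retained component.

    train_scores has one row per training sample and one column per PC.
    """
    train_scores = np.asarray(train_scores, dtype=float)
    # ASSUMPTION: sample standard deviation (ddof=1).
    sd = train_scores.std(axis=0, ddof=1)
    lower = train_scores.min(axis=0) - k * sd
    upper = train_scores.max(axis=0) + k * sd
    return lower, upper


def within_limits(new_scores, lower, upper):
    """True for each observation whose scores lie inside all limits."""
    new_scores = np.atleast_2d(np.asarray(new_scores, dtype=float))
    return np.all((new_scores >= lower) & (new_scores <= upper), axis=1)


# Training scores on two PCs (illustrative values).
scores = np.array([[-2.0, 0.0], [0.0, 1.0], [2.0, -1.0]])
lower, upper = score_limits(scores)
print(lower, upper)
print(within_limits([[2.5, 0.0], [3.5, 0.0]], lower, upper))
```

For these values the score standard deviations are 2 and 1, so the limits are -3 to 3 on the first component and -1.5 to 1.5 on the second; the first test point falls inside the limits and the second falls outside.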

More recent adaptations of the SIMCA method close off the hyper-plane by constructing ellipsoids (e.g. using Hotelling's T² or the Mahalanobis distance). With such modified SIMCA methods, an object is assigned to a class only if both its orthogonal distance from the model and its position within the model (i.e. its score values relative to the region defined by the ellipsoid) are not statistically significant.
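
A sketch of the combined test, using Hotelling's T² for the position within the model, is shown below. The function names are hypothetical, and the T² limit uses a widely quoted F-based formula for new observations, A(n - 1)(n + 1) / (n(n - A)) times the F quantile with A and n - A degrees of freedom. That formula is an assumption here, since the text does not specify how the ellipsoid is scaled.

```python
import numpy as np
from scipy.stats import f as f_dist


def hotelling_t2(new_scores, train_scores):
    """Hotelling's T² of new observations in the score space of one class.

    Scores are assumed to be mean-centred, as PCA scores of centred
    training data are.
    """
    train_scores = np.asarray(train_scores, dtype=float)
    var = train_scores.var(axis=0, ddof=1)       # variance of each PC score
    new_scores = np.atleast_2d(np.asarray(new_scores, dtype=float))
    return (new_scores ** 2 / var).sum(axis=1)


def t2_limit(n, A, alpha=0.05):
    """ASSUMED critical T² for a new observation (n samples, A components)."""
    return (A * (n - 1) * (n + 1) / (n * (n - A))
            * f_dist.ppf(1.0 - alpha, A, n - A))


def accepted(residual, s_crit, t2, t2_crit):
    """Both the orthogonal distance and the in-model position must pass."""
    return bool(residual <= s_crit and t2 <= t2_crit)


train = np.array([[-2.0, 0.0], [0.0, 1.0], [2.0, -1.0]])
print(hotelling_t2([[2.0, 1.0]], train))
print(t2_limit(n=30, A=2))
```

An object that lies close to the hyper-plane but far outside the ellipsoid is rejected by `accepted`, which is the behaviour that distinguishes these variants from a test on the residual distance alone.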

Application

SIMCA has gained widespread use as a classification method, especially in applied statistical fields such as chemometrics and spectroscopic data analysis.

References

* Wold, Svante, and Sjostrom, Michael, 1977, SIMCA: A method for analyzing chemical data in terms of similarity and analogy, in Kowalski, B.R., ed., Chemometrics Theory and Application, American Chemical Society Symposium Series 52, Wash., D.C., American Chemical Society, p. 243-282.


Wikimedia Foundation. 2010.
