Kernel trick

Kernel trick

In machine learning, the kernel trick is a method for using a linear classifier algorithm to solve a non-linear problem by mapping the original non-linear observations into a higher-dimensional space, where the linear classifier is subsequently used; this makes a linear classification in the new space equivalent to non-linear classification in the original space.

This is done using Mercer's theorem, which states that any continuous, symmetric, positive semi-definite kernel function "K"("x", "y") can be expressed as a dot product in a high-dimensional space.

More specifically, if the arguments to the kernel are in a measurable space "X", and if the kernel is positive semi-definite — i.e.

:sum_{i,j} K(x_i,x_j) c_i c_j ge 0

for any finite subset {"x"1, ..., "x"n} of "X" and subset {"c"1, ..., "c"n} of objects (typically real numbers) — then there exists a function φ("x") whose range is in an inner product space of possibly high dimension, such that

:K(x,y) = varphi(x)cdotvarphi(y).

The kernel trick transforms any algorithm that solely depends on the dot product between two vectors. Wherever a dot product is used, it is replaced with the kernel function. Thus, a linear algorithm can easily be transformed into a non-linear algorithm. This non-linear algorithm is equivalent to the linear algorithm operating in the range space of φ. However, because kernels are used, the φ function is never explicitly computed. This is desirable, because the high-dimensional space may be infinite-dimensional (as is the case when the kernel is a Gaussian).

The kernel trick was first published by Aizerman et al. [cite journal | author = M. Aizerman, E. Braverman, and L. Rozonoer | year = 1964 | title = Theoretical foundations of the potential function method in pattern recognition learning | journal = Automation and Remote Control | volume = 25 | pages = 821–837]

It has been applied to several kinds of algorithm in machine learning and statistics, including:
* Perceptrons
* Support vector machines
* Principal components analysis
* Canonical correlation analysis
* Fisher's linear discriminant analysis
* Clustering

The origin of the term "kernel trick" is not known.fact|date=March 2008

References

ee also

* Kernel methods
* Integral transforms
* Hilbert space, specifically reproducing kernel Hilbert space
* Mercer kernel


Wikimedia Foundation. 2010.

Игры ⚽ Поможем решить контрольную работу

Look at other dictionaries:

  • Kernel trick — En apprentissage automatique, le kernel trick (l astuce du noyau) est une méthode qui consiste à utiliser un classifieur linéaire pour résoudre un problème non linéaire, en transformant l espace de représentation des données d entrées en un… …   Wikipédia en Français

  • Kernel — may refer to:Computing* Kernel (computer science), the central component of most operating systems ** Linux kernel * Kernel (programming language), a Scheme like language * kernel trick, in machine learningLiterature* Kernel ( Lilo Stitch ),… …   Wikipedia

  • Kernel principal component analysis — (kernel PCA) is an extension of principal component analysis (PCA) using techniques of kernel methods. Using a kernel, the originally linear operations of PCA are done in a reproducing kernel Hilbert space with a non linear mapping.ExampleThe two …   Wikipedia

  • Kernel methods — (KMs) are a class of algorithms for pattern analysis, whose best known elementis the Support Vector Machine (SVM). The general task of pattern analysis is to find and study general types of relations (for example clusters, rankings, principal… …   Wikipedia

  • Kernel (mathematics) — In mathematics, the word kernel has several meanings. Kernel may mean a subset associated with a mapping:* The kernel of a mapping is the set of elements that map to the zero element (such as zero or zero vector), as in kernel of a linear… …   Wikipedia

  • Reproducing kernel Hilbert space — In functional analysis (a branch of mathematics), a reproducing kernel Hilbert space is a Hilbert space of functions in which pointwise evaluation is a continuous linear functional. Equivalently, they are spaces that can be defined by reproducing …   Wikipedia

  • Machine a vecteurs de support — Machine à vecteurs de support Les machines à vecteurs de support ou séparateurs à vaste marge (en anglais Support Vector Machine, SVM) sont un ensemble de techniques d apprentissage supervisé destinées à résoudre des problèmes de… …   Wikipédia en Français

  • Machine À Vecteurs De Support — Les machines à vecteurs de support ou séparateurs à vaste marge (en anglais Support Vector Machine, SVM) sont un ensemble de techniques d apprentissage supervisé destinées à résoudre des problèmes de discrimination[1] et de régression. Les SVM… …   Wikipédia en Français

  • Machine à vecteur de support — Machine à vecteurs de support Les machines à vecteurs de support ou séparateurs à vaste marge (en anglais Support Vector Machine, SVM) sont un ensemble de techniques d apprentissage supervisé destinées à résoudre des problèmes de… …   Wikipédia en Français

  • Machine à vecteurs de support — Les machines à vecteurs de support ou séparateurs à vaste marge (en anglais Support Vector Machine, SVM) sont un ensemble de techniques d apprentissage supervisé destinées à résoudre des problèmes de discrimination[note 1] et de régression. Les… …   Wikipédia en Français

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”