Centering matrix

In mathematics and multivariate statistics, the centering matrix [John I. Marden, "Analyzing and Modeling Rank Data", Chapman & Hall, 1995, ISBN 0412995212, page 59.] is a symmetric and idempotent matrix, which when multiplied with a vector has the same effect as subtracting the mean of the components of the vector from every component.

Definition

The centering matrix of size "n" is defined as the "n"-by-"n" matrix: $C_n = I_n - \frac{1}{n}\mathbf{1}\mathbf{1}'$ where $I_n$ is the identity matrix of size "n", $\mathbf{1}$ is the column-vector of "n" ones and where ${\,}'$ denotes matrix transpose. For example

: $C_1 = \begin{bmatrix} 0 \end{bmatrix}, \quad C_2 = \left[ \begin{array}{rr} \frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{array} \right], \quad C_3 = \left[ \begin{array}{rrr} \frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & -\frac{1}{3} & \frac{2}{3} \end{array} \right]$
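The definition can be turned into a few lines of code. The following is a minimal sketch in plain Python; the function name `centering_matrix` is an illustrative choice, not part of any library, and exact fractions are used only so that the entries match the examples above without rounding.

```python
from fractions import Fraction

def centering_matrix(n):
    """Return C_n = I_n - (1/n) 1 1' as a list of rows of exact Fractions.

    Hypothetical helper for illustration: entry (i, j) is 1 - 1/n on the
    diagonal and -1/n everywhere else.
    """
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

# Print C_3 row by row.
for row in centering_matrix(3):
    print([str(x) for x in row])
# ['2/3', '-1/3', '-1/3']
# ['-1/3', '2/3', '-1/3']
# ['-1/3', '-1/3', '2/3']
```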

Properties

Given a column-vector, $\mathbf{v}$, of size "n", the centering property of $C_n$ can be expressed as: $C_n\,\mathbf{v} = \mathbf{v}-(\frac{1}{n}\mathbf{1}'\mathbf{v})\mathbf{1}$ where $\frac{1}{n}\mathbf{1}'\mathbf{v}$ is the mean of the components of $\mathbf{v}$.
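As a small numerical check of the centering property, the sketch below multiplies an example vector by $C_n$ and compares the result with direct mean subtraction. The helpers `centering_matrix` and `matvec` and the sample vector are assumptions made for illustration.

```python
from fractions import Fraction

def centering_matrix(n):
    # Hypothetical helper: C_n = I_n - (1/n) 1 1' with exact entries.
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matvec(A, v):
    # Plain matrix-vector product A v.
    return [sum(a * x for a, x in zip(row, v)) for row in A]

v = [Fraction(x) for x in (4, 8, 15)]   # arbitrary example vector
mean = sum(v) / len(v)                  # (1/n) 1'v, here 9

by_matrix = matvec(centering_matrix(len(v)), v)   # C_n v
by_subtraction = [x - mean for x in v]            # v - mean * 1

print(by_matrix == by_subtraction)  # True
print(sum(by_matrix) == 0)          # True: the centered vector has zero mean
```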

$C_n$ is symmetric positive semi-definite.

$C_n$ is idempotent, so that $C_n^k=C_n$, for $k=1,2,\ldots$. Once you have removed the mean, it is zero and removing it again has no effect.
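
Idempotence can also be verified directly from the definition. The short derivation below uses only the fact that $\mathbf{1}'\mathbf{1} = n$:

: $C_n^2 = (I_n - \frac{1}{n}\mathbf{1}\mathbf{1}')(I_n - \frac{1}{n}\mathbf{1}\mathbf{1}') = I_n - \frac{2}{n}\mathbf{1}\mathbf{1}' + \frac{1}{n^2}\mathbf{1}(\mathbf{1}'\mathbf{1})\mathbf{1}' = I_n - \frac{2}{n}\mathbf{1}\mathbf{1}' + \frac{1}{n}\mathbf{1}\mathbf{1}' = C_n$

Combined with symmetry, this also gives positive semi-definiteness, since for any vector $\mathbf{v}$:

: $\mathbf{v}'C_n\mathbf{v} = \mathbf{v}'C_n'C_n\mathbf{v} = \|C_n\mathbf{v}\|^2 \ge 0$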

$C_n$ is singular. The effects of applying the transformation $C_n\,\mathbf{v}$ cannot be reversed.

$C_n$ has the eigenvalue 1 with multiplicity "n" − 1 and the eigenvalue 0 with multiplicity 1.

$C_n$ has a nullspace of dimension 1, along the vector $\mathbf{1}$.

$C_n$ is a projection matrix. That is, $C_n\mathbf{v}$ is a projection of $\mathbf{v}$ onto the ("n" − 1)-dimensional subspace that is orthogonal to the nullspace spanned by $\mathbf{1}$. (This is the subspace of all "n"-vectors whose components sum to zero.)
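The eigenvalue and projection statements can be spot-checked numerically. The sketch below is illustrative only: the helper names and the test vector are assumptions, and it checks consequences of the properties rather than computing an eigendecomposition.

```python
from fractions import Fraction

def centering_matrix(n):
    # Hypothetical helper: C_n = I_n - (1/n) 1 1' with exact entries.
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matvec(A, v):
    # Plain matrix-vector product A v.
    return [sum(a * x for a, x in zip(row, v)) for row in A]

n = 4
C = centering_matrix(n)
ones = [Fraction(1)] * n
w = [Fraction(x) for x in (3, -1, -2, 0)]   # example vector whose components sum to zero

print(matvec(C, ones) == [0] * n)               # True: 1 lies in the nullspace (eigenvalue 0)
print(matvec(C, w) == w)                        # True: a zero-sum vector is left unchanged (eigenvalue 1)
print(sum(C[i][i] for i in range(n)) == n - 1)  # True: the trace equals the sum of the eigenvalues
```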

Application

Although multiplication by the centering matrix is not a computationally efficient way of removing the mean from a vector, it forms an analytical tool that conveniently and succinctly expresses mean removal. It can be used not only to remove the mean of a single vector, but also of multiple vectors stored in the rows or columns of a matrix. For an "m"-by-"n" matrix $X$, the multiplication $C_m\,X$ removes the means from each of the "n" columns, while $X\,C_n$ removes the means from each of the "m" rows.
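The difference between left and right multiplication can be seen on a small example. The following sketch uses an arbitrary 2-by-3 matrix; `centering_matrix` and `matmul` are hypothetical helpers written for this illustration.

```python
from fractions import Fraction

def centering_matrix(n):
    # Hypothetical helper: C_n = I_n - (1/n) 1 1' with exact entries.
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    # Plain matrix product A B for lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Example data: m = 2 rows, n = 3 columns.
X = [[Fraction(x) for x in row] for row in ([1, 2, 3], [4, 6, 11])]
m, n = len(X), len(X[0])

cols_centered = matmul(centering_matrix(m), X)   # C_m X: each column now has mean zero
rows_centered = matmul(X, centering_matrix(n))   # X C_n: each row now has mean zero

print([sum(col) for col in zip(*cols_centered)] == [0] * n)  # True
print([sum(row) for row in rows_centered] == [0] * m)        # True
```

In practice one would subtract the row or column means directly, as the text notes; the matrix form is mainly useful for derivations.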

The centering matrix provides in particular a succinct way to express the scatter matrix, $S=(X-\mu\mathbf{1}')(X-\mu\mathbf{1}')'$ of a data sample $X$ with one observation per column, where $\mu= \frac{1}{n}X\mathbf{1}$ is the sample mean. Since $X-\mu\mathbf{1}' = X\,C_n$, and $C_n$ is symmetric and idempotent, the scatter matrix can be expressed more compactly as: $S=X\,C_n(X\,C_n)'=X\,C_n\,C_n\,X\,'=X\,C_n\,X\,'.$
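The two expressions for the scatter matrix can be compared on a small data set. This is a sketch under the assumption that the "n" observations are stored as the columns of $X$; the helper names and the data are made up for illustration.

```python
from fractions import Fraction

def centering_matrix(n):
    # Hypothetical helper: C_n = I_n - (1/n) 1 1' with exact entries.
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    # Plain matrix product A B for lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Example data: 2 variables (rows), n = 3 observations (columns).
X = [[Fraction(x) for x in row] for row in ([1, 2, 3], [4, 6, 11])]
n = len(X[0])

# Direct form: subtract the sample mean mu from every observation, then D D'.
mu = [sum(row) / n for row in X]
D = [[x - mu_i for x in row] for row, mu_i in zip(X, mu)]   # X - mu 1'
S_direct = matmul(D, transpose(D))

# Compact form: S = X C_n X'.
S_compact = matmul(matmul(X, centering_matrix(n)), transpose(X))

print(S_direct == S_compact)  # True
```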

References

Wikimedia Foundation. 2010.
