Centering matrix

In mathematics and multivariate statistics, the centering matrix [John I. Marden, "Analyzing and Modeling Rank Data", Chapman & Hall, 1995, ISBN 0412995212, page 59.] is a symmetric and idempotent matrix which, when multiplied by a vector, has the same effect as subtracting the mean of the vector's components from every component.

Definition

The centering matrix of size $n$ is defined as the $n$-by-$n$ matrix

$$C_n = I_n - \frac{1}{n}\mathbf{1}\mathbf{1}'$$

where $I_n$ is the identity matrix of size $n$, $\mathbf{1}$ is the column vector of $n$ ones, and $'$ denotes matrix transpose. For example:

$$C_1 = \begin{bmatrix} 0 \end{bmatrix}, \quad C_2 = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{bmatrix}, \quad C_3 = \begin{bmatrix} \frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & -\frac{1}{3} & \frac{2}{3} \end{bmatrix}$$
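The definition can be sketched in a few lines of plain Python. This is only an illustration: the helper name `centering_matrix` is hypothetical, and exact fractions from the standard library are used so that the entries match the examples above without rounding.

```python
from fractions import Fraction

def centering_matrix(n):
    """Return C_n = I_n - (1/n) 1 1' as a list of rows of exact fractions.

    Hypothetical helper for illustration: each entry is the identity
    entry (1 on the diagonal, 0 elsewhere) minus 1/n.
    """
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

C1 = centering_matrix(1)  # the single entry 0
C2 = centering_matrix(2)  # rows (1/2, -1/2) and (-1/2, 1/2)
C3 = centering_matrix(3)  # 2/3 on the diagonal, -1/3 elsewhere

for row in C3:
    print([str(entry) for entry in row])
```

Every row of the result sums to zero, which is one way to see that the vector of ones is mapped to the zero vector.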

Properties

Given a column vector $\mathbf{v}$ of size $n$, the centering property of $C_n$ can be expressed as

$$C_n\,\mathbf{v} = \mathbf{v} - \left(\frac{1}{n}\mathbf{1}'\mathbf{v}\right)\mathbf{1}$$

where $\frac{1}{n}\mathbf{1}'\mathbf{v}$ is the mean of the components of $\mathbf{v}$.
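As a small numerical check of the centering property, the sketch below multiplies an example vector by the centering matrix and compares the result with direct mean subtraction. The helper names (`centering_matrix`, `matvec`) and the sample vector are assumptions made for illustration.

```python
from fractions import Fraction

def centering_matrix(n):
    """Hypothetical helper: C_n = I_n - (1/n) 1 1' with exact fractions."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

v = [Fraction(2), Fraction(4), Fraction(9)]  # example data, mean is 5
mean = sum(v) / len(v)

centered_by_matrix = matvec(centering_matrix(len(v)), v)
centered_directly = [x - mean for x in v]

# Both routes give the same centered vector, (-3, -1, 4).
print(centered_by_matrix == centered_directly)
```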

$C_n$ is symmetric positive semi-definite.

$C_n$ is idempotent, so that $C_n^k = C_n$ for $k = 1, 2, \ldots$. Once the mean has been removed, it is zero, so removing it again has no effect.

$C_n$ is singular. The effect of applying the transformation $C_n\,\mathbf{v}$ cannot be reversed.

$C_n$ has the eigenvalue 1 with multiplicity $n - 1$ and the eigenvalue 0 with multiplicity 1.

$C_n$ has a nullspace of dimension 1, along the vector $\mathbf{1}$.

$C_n$ is a projection matrix. That is, $C_n\mathbf{v}$ is the projection of $\mathbf{v}$ onto the $(n - 1)$-dimensional subspace that is orthogonal to the nullspace spanned by $\mathbf{1}$. (This is the subspace of all $n$-vectors whose components sum to zero.)
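Several of these properties can be checked exactly for a small case. The following sketch assumes $n = 4$ and uses hypothetical helpers (`centering_matrix`, `matmul`); it does not compute eigenvalues directly but verifies the corresponding eigenvector relations and the trace, which equals the sum of the eigenvalues.

```python
from fractions import Fraction

def centering_matrix(n):
    """Hypothetical helper: C_n = I_n - (1/n) 1 1' with exact fractions."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

n = 4
C = centering_matrix(n)

# Symmetry: C equals its transpose.
is_symmetric = C == [list(col) for col in zip(*C)]

# Idempotence: applying C twice is the same as applying it once.
is_idempotent = matmul(C, C) == C

# Nullspace: the vector of ones is mapped to zero (eigenvalue 0).
ones = [[Fraction(1)] for _ in range(n)]
kills_ones = matmul(C, ones) == [[0]] * n

# Any vector whose components sum to zero is left unchanged (eigenvalue 1).
w = [[Fraction(3)], [Fraction(-1)], [Fraction(-2)], [Fraction(0)]]
fixes_zero_sum = matmul(C, w) == w

# The trace equals the sum of the eigenvalues, here n - 1.
trace = sum(C[i][i] for i in range(n))

print(is_symmetric, is_idempotent, kills_ones, fixes_zero_sum, trace)
```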

Application

Although multiplication by the centering matrix is not a computationally efficient way of removing the mean from a vector, it is an analytical tool that conveniently and succinctly expresses mean removal. It can be used to remove the mean not only of a single vector, but also of multiple vectors stored in the rows or columns of a matrix. For an $m$-by-$n$ matrix $X$, the multiplication $C_m\,X$ removes the means from each of the $n$ columns, while $X\,C_n$ removes the means from each of the $m$ rows.
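The difference between left and right multiplication can be illustrated with a small 2-by-3 example. The matrix `X` and the helper names below are assumptions chosen for illustration only.

```python
from fractions import Fraction

def centering_matrix(n):
    """Hypothetical helper: C_n = I_n - (1/n) 1 1' with exact fractions."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Example data: m = 2 rows, n = 3 columns.
X = [[1, 2, 6],
     [3, 4, 2]]
m, n = len(X), len(X[0])

# Left multiplication by C_m subtracts each column's mean (2, 3, 4).
column_centered = matmul(centering_matrix(m), X)

# Right multiplication by C_n subtracts each row's mean (3 and 3).
row_centered = matmul(X, centering_matrix(n))

print(column_centered == [[-1, -1, 2], [1, 1, -2]])
print(row_centered == [[-2, -1, 3], [0, 1, -1]])
```

In practice one would subtract the means directly, as the text notes; the matrix form is mainly useful for derivations.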

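In particular, the centering matrix provides a succinct way to express the scatter matrix, $S = (X - \mu\mathbf{1}')(X - \mu\mathbf{1}')'$, of an $m$-by-$n$ data sample $X$ with one observation per column, where $\mu = \frac{1}{n}X\mathbf{1}$ is the sample mean. Since $X - \mu\mathbf{1}' = X - \frac{1}{n}X\mathbf{1}\mathbf{1}' = X\,C_n$, and $C_n$ is symmetric and idempotent, the scatter matrix can be written more compactly as

$$S = X\,C_n(X\,C_n)' = X\,C_n\,C_n\,X' = X\,C_n\,X'.$$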
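The identity can be checked on a small example. The sketch below assumes a 2-by-3 sample with one observation per column and compares the scatter matrix computed from explicit deviations with the compact form $X\,C_n\,X'$. The data and helper names are hypothetical.

```python
from fractions import Fraction

def centering_matrix(n):
    """Hypothetical helper: C_n = I_n - (1/n) 1 1' with exact fractions."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

# Example sample: 2 variables (rows), n = 3 observations (columns).
X = [[1, 2, 6],
     [3, 4, 2]]
n = len(X[0])

# Sample mean mu = (1/n) X 1, one entry per variable.
mu = [Fraction(sum(row), n) for row in X]

# Direct form: deviations D = X - mu 1', then S = D D'.
D = [[x - m for x in row] for row, m in zip(X, mu)]
S_direct = matmul(D, transpose(D))

# Compact form: S = X C_n X'.
S_compact = matmul(matmul(X, centering_matrix(n)), transpose(X))

print(S_direct == S_compact)
```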

References


Wikimedia Foundation. 2010.
