Computational formula for the variance
See also: Algorithms for calculating variance
In probability theory and statistics, the computational formula for the variance Var(X) of a random variable X is the formula
$$\operatorname{Var}(X) = \operatorname{E}(X^2) - [\operatorname{E}(X)]^2,$$
where E(X) is the expected value of X.
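As a small worked example (the choice of distribution is an assumption made only for illustration), let X be the outcome of a fair six-sided die. Taking the variance as the expected value of X squared minus the square of the expected value of X gives:

```latex
\begin{align*}
\operatorname{E}(X)   &= \frac{1+2+3+4+5+6}{6} = \frac{7}{2}, \\
\operatorname{E}(X^2) &= \frac{1+4+9+16+25+36}{6} = \frac{91}{6}, \\
\operatorname{Var}(X) &= \operatorname{E}(X^2) - [\operatorname{E}(X)]^2
                       = \frac{91}{6} - \frac{49}{4} = \frac{35}{12}.
\end{align*}
```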
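A closely related identity can be used to calculate the sample variance, which is often used as an unbiased estimate of the population variance:
$$\hat{\sigma}^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x})^2 = \frac{N}{N-1}\left(\frac{1}{N}\sum_{i=1}^{N}x_i^2 - \bar{x}^2\right),$$
where $\bar{x}$ is the sample mean of $x_1, \dots, x_N$.
The second result is sometimes, unwisely, used in practice to calculate the variance. The problem is that subtracting two nearly equal numbers can lead to catastrophic cancellation[1].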
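The cancellation problem can be illustrated with a minimal Python sketch. The function names and the data set are hypothetical, chosen only for this illustration, and the behaviour described assumes IEEE 754 double-precision floats. The naive function uses the sum of squares minus N times the squared mean; the two-pass function subtracts the mean before squaring.

```python
def sample_variance_naive(xs):
    """Sample variance via the sum of squares minus N times the squared mean."""
    n = len(xs)
    mean = sum(xs) / n
    sum_sq = sum(x * x for x in xs)
    # Two huge, nearly equal numbers are subtracted here.
    return (sum_sq - n * mean * mean) / (n - 1)


def sample_variance_two_pass(xs):
    """Sample variance via squared deviations from the mean (two passes)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Hypothetical data: small spread around a large offset.
# The exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

naive = sample_variance_naive(data)
two_pass = sample_variance_two_pass(data)

print("naive:   ", naive)
print("two-pass:", two_pass)
```

With this data the squares are near 10^18, where adjacent doubles are 128 apart, so the naive difference cannot equal the true value 90 and the naive result cannot be 30. The two-pass version works with the deviations -6, -3, 3 and 6, which are exact, and returns 30.0.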
Proof
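The computational formula for the population variance follows in a straightforward manner from the linearity of expected values and the definition of variance:
$$\begin{aligned}
\operatorname{Var}(X) &= \operatorname{E}\left[(X - \operatorname{E}(X))^2\right] \\
&= \operatorname{E}\left[X^2 - 2X\operatorname{E}(X) + [\operatorname{E}(X)]^2\right] \\
&= \operatorname{E}(X^2) - 2\operatorname{E}(X)\operatorname{E}(X) + [\operatorname{E}(X)]^2 \\
&= \operatorname{E}(X^2) - [\operatorname{E}(X)]^2.
\end{aligned}$$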
Generalization to covariance
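This formula can be generalized for covariance, with two random variables $X_i$ and $X_j$:
$$\operatorname{Cov}(X_i, X_j) = \operatorname{E}(X_i X_j) - \operatorname{E}(X_i)\operatorname{E}(X_j),$$
as well as for the n by n covariance matrix of a random vector of length n:
$$\operatorname{Var}(\mathbf{X}) = \operatorname{E}(\mathbf{X}\mathbf{X}^\top) - \operatorname{E}(\mathbf{X})\operatorname{E}(\mathbf{X})^\top,$$
and for the n by m cross-covariance matrix between two random vectors of length n and m:
$$\operatorname{Cov}(\mathbf{X}, \mathbf{Y}) = \operatorname{E}(\mathbf{X}\mathbf{Y}^\top) - \operatorname{E}(\mathbf{X})\operatorname{E}(\mathbf{Y})^\top,$$
where expectations are taken element-wise and $\mathbf{X}$ and $\mathbf{Y}$ are random vectors of respective lengths n and m.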
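The covariance form of the identity can be checked exactly on a small example. The following Python sketch assumes a random vector of length 2 that takes four hypothetical outcomes with equal probability. The outcomes and helper names are illustrative only. It uses exact rational arithmetic, so no rounding is involved, and compares each entry of the covariance matrix computed from the definition with the same entry computed as the expected product minus the product of the expectations.

```python
from fractions import Fraction

# Hypothetical data: four equally likely outcomes of a random vector (X_1, X_2).
outcomes = [(1, 2), (2, 1), (3, 5), (6, 4)]


def expect(values):
    """Expected value under the uniform distribution on the outcomes."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def cov_by_definition(i, j):
    """E[(X_i - E(X_i)) (X_j - E(X_j))]."""
    mu_i = expect(x[i] for x in outcomes)
    mu_j = expect(x[j] for x in outcomes)
    return expect((x[i] - mu_i) * (x[j] - mu_j) for x in outcomes)


def cov_by_formula(i, j):
    """E(X_i X_j) - E(X_i) E(X_j)."""
    mean_of_product = expect(x[i] * x[j] for x in outcomes)
    product_of_means = expect(x[i] for x in outcomes) * expect(x[j] for x in outcomes)
    return mean_of_product - product_of_means


n = 2  # length of the random vector
matrix_definition = [[cov_by_definition(i, j) for j in range(n)] for i in range(n)]
matrix_formula = [[cov_by_formula(i, j) for j in range(n)] for i in range(n)]

print(matrix_formula)
print(matrix_definition == matrix_formula)
```

Both computations give the matrix with rows (7/2, 7/4) and (7/4, 5/2). The diagonal entries are the variances of the two components, so the scalar identity appears as a special case.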
Applications
Its applications in systolic geometry include Loewner's torus inequality.
References
1. Donald E. Knuth (1998). The Art of Computer Programming, volume 2: Seminumerical Algorithms, 3rd edn., p. 232. Boston: Addison-Wesley.
Wikimedia Foundation. 2010.