Confidence region

In statistics, a confidence region is a multi-dimensional generalization of a confidence interval. It is a set of points in an n-dimensional space, often represented as an ellipsoid around a point which is an estimated solution to a problem, although other shapes can occur.

The confidence region is calculated in such a way that, if a set of measurements were repeated many times and a confidence region calculated in the same way from each set, then a specified percentage of those regions (e.g. 95%) would, on average, include the point representing the "true" values of the variables being estimated. This does not mean, once one particular confidence region has been calculated, that there is a 95% probability that the "true" values lie inside it: no probability distribution is assumed for the "true" values, and other information about where they are likely to lie may or may not be available.

The case of independent, identically normally-distributed errors

Suppose we have found a solution \boldsymbol{\beta} to the following overdetermined problem:

\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}

where Y is an n-dimensional column vector containing observed values, X is an n-by-p matrix which can represent a physical model and which is assumed to be known exactly, \boldsymbol{\beta} is a column vector containing the p parameters which are to be estimated, and \boldsymbol{\varepsilon} is an n-dimensional column vector of errors which are assumed to be independently and normally distributed with zero mean and the same unknown variance \sigma^2.

A joint 100(1 - α) % confidence region for the elements of \boldsymbol{\beta} is represented by the set of values of the vector b which satisfy the following inequality:[1]

 (\boldsymbol{\beta} - \mathbf{b})^\prime \mathbf{X}^\prime\mathbf{X}(\boldsymbol{\beta} - \mathbf{b}) \le  ps^2 F_{1 - \alpha}(p,\nu) ,

where the variable b represents any point in the confidence region, p is the number of parameters, i.e. the number of elements of the vector \boldsymbol{\beta}, and s^2 is an unbiased estimate of \sigma^2 equal to

s^2=\frac{\varepsilon^\prime\varepsilon}{n - p}.

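In the expression for s^2, \varepsilon denotes the vector of residuals \mathbf{Y} - \mathbf{X}\boldsymbol{\beta} evaluated at the solution. Further, F is the quantile function of the F-distribution with p and ν = n - p degrees of freedom, α is the statistical significance level, and the symbol X^\prime means the transpose of X.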
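The coverage property described in the introduction can be checked numerically against this inequality. The following is a minimal sketch rather than a reference implementation. It assumes a straight-line model (p = 2) with hypothetical true parameters and noise level, and it relies on the closed form of the F quantile that holds only when the first degrees-of-freedom parameter equals 2. All function names are illustrative.

```python
import random

def f_quantile_2(nu, alpha):
    """F_{1-alpha}(2, nu); this closed form is valid only for p = 2."""
    return (nu / 2.0) * (alpha ** (-2.0 / nu) - 1.0)

def region_contains(xs, ys, b, alpha=0.05):
    """Fit y = beta0 + beta1*x by least squares, then test whether b
    satisfies (beta - b)' X'X (beta - b) <= p s^2 F_{1-alpha}(p, n-p)."""
    n, p = len(xs), 2
    sx, sxx = sum(xs), sum(x * x for x in xs)
    sy, sxy = sum(ys), sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    # Solution of the normal equations with X'X = [[n, sx], [sx, sxx]]
    beta0 = (sxx * sy - sx * sxy) / det
    beta1 = (n * sxy - sx * sy) / det
    rss = sum((y - beta0 - beta1 * x) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - p)
    d0, d1 = beta0 - b[0], beta1 - b[1]
    q = n * d0 * d0 + 2.0 * sx * d0 * d1 + sxx * d1 * d1
    return q <= p * s2 * f_quantile_2(n - p, alpha)

def coverage(trials=2000, seed=0):
    """Fraction of simulated data sets whose 95% region contains the truth."""
    rng = random.Random(seed)
    xs = [float(i) for i in range(10)]
    truth = (1.0, 0.5)  # hypothetical "true" intercept and slope
    hits = 0
    for _ in range(trials):
        ys = [truth[0] + truth[1] * x + rng.gauss(0.0, 1.0) for x in xs]
        hits += region_contains(xs, ys, truth)
    return hits / trials

print(coverage())
```

With the assumptions of this section satisfied by construction, the printed fraction should be close to 0.95, up to Monte Carlo error.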

The above inequality defines an ellipsoidal region in the p-dimensional Cartesian parameter space R^p. The centre of the ellipsoid is at the solution \boldsymbol\beta. According to Press et al., the ellipsoid is easier to plot after a singular value decomposition of X. The lengths of the axes of the ellipsoid are proportional to the reciprocals of the singular values, i.e. the entries on the diagonal of the diagonal matrix, and the directions of these axes are given by the rows of the third matrix of the decomposition (the transposed matrix of right singular vectors).
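For p = 2 the axes can be found without a general SVD routine, because the eigenvalues of X'X are the squared singular values of X and its eigenvectors are the right singular vectors. The sketch below is only an illustration under that assumption. The function name, the example matrix and the values of s^2 and F are hypothetical.

```python
import math

def ellipse_axes(xtx, c):
    """Semi-axes of the ellipse {d : d' xtx d <= c} for a symmetric
    positive-definite 2x2 matrix xtx.
    Returns a list of (semi_axis_length, unit_direction) pairs."""
    (a, b), (_, d) = xtx
    if b == 0.0:
        # Already diagonal: the axes are the coordinate axes.
        return [(math.sqrt(c / a), (1.0, 0.0)),
                (math.sqrt(c / d), (0.0, 1.0))]
    mean = (a + d) / 2.0
    half = math.hypot((a - d) / 2.0, b)
    axes = []
    for lam in (mean + half, mean - half):   # eigenvalues of xtx
        vx, vy = b, lam - a                  # matching eigenvector
        norm = math.hypot(vx, vy)
        # Semi-axis length is sqrt(c / lam), i.e. sqrt(c) / singular value.
        axes.append((math.sqrt(c / lam), (vx / norm, vy / norm)))
    return axes

# Illustrative values: X'X for a straight line sampled at x = 0..4,
# with an assumed s^2 = 0.5 and F_{0.95}(2, 3) of about 9.55.
xtx = [[5.0, 10.0], [10.0, 30.0]]
c = 2 * 0.5 * 9.55  # p * s^2 * F
for length, direction in ellipse_axes(xtx, c):
    print(length, direction)
```

The ellipse is then drawn around the solution by adding each scaled direction to the centre. The larger eigenvalue gives the shorter axis, which matches the reciprocal relationship stated above.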

Weighted and generalised least squares

Now consider the more general case where some distinct elements of \boldsymbol{\varepsilon} have known nonzero covariance (in other words, the errors in the observations are not independently distributed), and/or the standard deviations of the errors are not all equal. Suppose the covariance matrix of \boldsymbol{\varepsilon} is \mathbf{V}\sigma^2, where V is an n-by-n nonsingular matrix. In the more specific case handled in the previous section V was equal to the identity matrix I, so that the covariance matrix was \mathbf{I}\sigma^2. Here V is allowed to have nonzero off-diagonal elements, representing the covariance of pairs of individual observations, and diagonal elements that are not all equal.

It is possible to find[2] a nonsingular symmetric matrix P such that

\mathbf{P}^\prime\mathbf{P} = \mathbf{P}\mathbf{P} = \mathbf{V}

In effect, P is a square root of the covariance matrix V.

The least-squares problem

\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}

can then be transformed by left-multiplying each term by the inverse of P, forming the new problem formulation

\mathbf{Z} = \mathbf{Q}\boldsymbol{\beta} + \mathbf{f} ,

where

\mathbf{Z} = \mathbf{P}^{-1}\mathbf{Y}
\mathbf{Q} = \mathbf{P}^{-1}\mathbf{X} and
\mathbf{f} = \mathbf{P}^{-1}\boldsymbol{\varepsilon}
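As an illustration of this transformation, consider the special case in which V is diagonal (weighted least squares), so that P is simply the diagonal matrix of the square roots of the diagonal elements of V. The sketch below assumes that special case and a two-parameter model. It does not handle a general V with off-diagonal elements, and the function names are hypothetical.

```python
import math

def whiten_diagonal(x_rows, y, v_diag):
    """Return (Q, Z) = (P^-1 X, P^-1 Y) for V = diag(v_diag),
    where P = diag(sqrt(v_i)). Assumes every v_i > 0."""
    q_rows, z = [], []
    for row, yi, vi in zip(x_rows, y, v_diag):
        w = 1.0 / math.sqrt(vi)          # i-th diagonal entry of P^-1
        q_rows.append([w * x for x in row])
        z.append(w * yi)
    return q_rows, z

def solve_two_params(q_rows, z):
    """Solve the normal equations Q'Q beta = Q'Z for two parameters."""
    a = sum(r[0] * r[0] for r in q_rows)
    b = sum(r[0] * r[1] for r in q_rows)
    d = sum(r[1] * r[1] for r in q_rows)
    g0 = sum(r[0] * zi for r, zi in zip(q_rows, z))
    g1 = sum(r[1] * zi for r, zi in zip(q_rows, z))
    det = a * d - b * b
    return ((d * g0 - b * g1) / det, (a * g1 - b * g0) / det)

# Illustrative data: the last two observations have four times the variance.
x_rows = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.1, 2.9, 5.2, 6.8]
q_rows, z = whiten_diagonal(x_rows, y, [1.0, 1.0, 4.0, 4.0])
print(solve_two_params(q_rows, z))
```

After the transformation the errors f have covariance matrix \mathbf{I}\sigma^2, so the transformed problem can be treated with the methods of the previous section.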

A joint confidence region for the parameters, i.e. for the elements of \boldsymbol{\beta}, is then bounded by the ellipsoid given by:[3]

 (\mathbf{b} - \boldsymbol{\beta})^\prime \mathbf{Q}^\prime\mathbf{Q}(\mathbf{b} - \boldsymbol{\beta}) = {\frac{p}{n - p}} (\mathbf{Z}^\prime\mathbf{Z} - \mathbf{b}^\prime\mathbf{Q}^\prime\mathbf{Z})F_{1 - \alpha}(p,n-p).

Here F_{1 - \alpha}(p,n-p) is the 100(1 - α) percentage point (the 1 - α quantile) of the F-distribution, whose two parameters are the degrees of freedom p and n - p.

Nonlinear problems

Confidence regions can be defined for any probability distribution. The experimenter can choose the significance level and the shape of the region, and then the size of the region is determined by the probability distribution. A natural choice is to use as a boundary a set of points with constant \chi^2 (chi-squared) values.

One approach is to use a linear approximation to the nonlinear model, which may be a close approximation in the vicinity of the solution, and then apply the analysis for a linear problem to find an approximate confidence region. This may be a reasonable approach if the confidence region is not very large and the second derivatives of the model are also not very large.
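The linearisation approach can be sketched as follows: the Jacobian of the model at the solution takes the place of X in the inequality for the linear case. This is only an illustration under several assumptions. The exponential-decay model, the data and the parameter values are hypothetical, the estimate theta_hat is assumed to have been obtained by a separate fitting step that is not shown, and the F quantile uses the closed form valid only for p = 2.

```python
import math

def f_quantile_2(nu, alpha):
    """F_{1-alpha}(2, nu); this closed form is valid only for p = 2."""
    return (nu / 2.0) * (alpha ** (-2.0 / nu) - 1.0)

def jacobian(model, theta, ts, h=1e-6):
    """Forward-difference Jacobian: J[i][j] = d model(t_i, theta) / d theta_j."""
    base = [model(t, theta) for t in ts]
    cols = []
    for j in range(len(theta)):
        bumped = list(theta)
        bumped[j] += h
        cols.append([(model(t, bumped) - f0) / h for t, f0 in zip(ts, base)])
    return [list(row) for row in zip(*cols)]

def linearized_region_contains(model, theta_hat, ts, ys, b, alpha=0.05):
    """Approximate test: (theta_hat - b)' J'J (theta_hat - b) <= p s^2 F."""
    n, p = len(ts), 2
    J = jacobian(model, theta_hat, ts)
    rss = sum((y - model(t, theta_hat)) ** 2 for t, y in zip(ts, ys))
    s2 = rss / (n - p)
    d = [theta_hat[0] - b[0], theta_hat[1] - b[1]]
    jd = [row[0] * d[0] + row[1] * d[1] for row in J]
    q = sum(v * v for v in jd)           # d' J'J d equals |J d|^2
    return q <= p * s2 * f_quantile_2(n - p, alpha)

def decay(t, theta):
    """Hypothetical nonlinear model: amplitude * exp(-rate * t)."""
    return theta[0] * math.exp(-theta[1] * t)

ts = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
theta_hat = (2.0, 0.5)                   # assumed fitted values
noise = [0.01, -0.02, 0.015, -0.01, 0.02, -0.015]
ys = [decay(t, theta_hat) + e for t, e in zip(ts, noise)]

print(linearized_region_contains(decay, theta_hat, ts, ys, (2.0, 0.5)))
print(linearized_region_contains(decay, theta_hat, ts, ys, (3.0, 0.5)))
```

The result is only as trustworthy as the linear approximation itself, so the caveats above about the size of the region and of the second derivatives still apply.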

Notes

  1. Draper and Smith (1981, p. 94)
  2. Draper and Smith (1981, p. 108)
  3. Draper and Smith (1981, p. 109)

References

  • Draper, N.R.; Smith, H. (1981) [1966]. Applied Regression Analysis (2nd ed.). USA: John Wiley and Sons Ltd. ISBN 0471029955.
  • Press, W.H.; Teukolsky, S.A.; Vetterling, W.T.; Flannery, B.P. (1992) [1988]. Numerical Recipes in C: The Art of Scientific Computing (2nd ed.). Cambridge UK: Cambridge University Press.

Wikimedia Foundation. 2010.
