Omitted-variable bias

In statistics, omitted-variable bias (OVB) occurs when a model leaves out one or more relevant variables. The bias arises because the model attributes the effect of the missing variable to the variables that are included, over- or under-estimating their effects.

More specifically, OVB is the bias that appears in the parameter estimates of a regression analysis when the assumed specification is incorrect in that it omits an independent variable (possibly non-delineated) that should be in the model.

Omitted-variable bias in linear regression

Two conditions must hold true for omitted-variable bias to exist in linear regression:

the omitted variable must be a determinant of the dependent variable (i.e., its true regression coefficient is not zero); and
the omitted variable must be correlated with one or more of the included independent variables.
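The two conditions can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `short_regression_slope`, the true slope of 2, the sample size, and the seed are all hypothetical choices made for illustration, and the variables are generated with mean zero so that no intercept is needed.

```python
import numpy as np

def short_regression_slope(delta, rho, n=200_000, seed=0):
    """Slope from regressing y on x alone when y = 2*x + delta*z + u.

    rho is the correlation between x and the omitted variable z.
    All values here are illustrative assumptions, not from the article.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    # Build z with unit variance and correlation rho with x.
    z = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    u = rng.standard_normal(n)
    y = 2.0 * x + delta * z + u
    # Least squares slope without intercept (all variables have mean zero).
    return float(x @ y / (x @ x))

# Both conditions hold: the slope is pulled away from the true value 2.
both = short_regression_slope(delta=1.5, rho=0.6)
# First condition fails (z does not affect y): no bias.
no_effect = short_regression_slope(delta=0.0, rho=0.6)
# Second condition fails (z uncorrelated with x): no bias.
uncorrelated = short_regression_slope(delta=1.5, rho=0.0)

print(both, no_effect, uncorrelated)
```

In this setup the large-sample value of the slope is 2 + delta * rho, so only the first scenario, where both conditions hold, should land noticeably away from 2.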

As an example, consider a linear model of the form

$y_i = x_i \beta + z_i \delta + u_i,\qquad i = 1,\dots,n$

where

x_i is a 1 × p row vector, and is part of the observed data;
β is a p × 1 column vector of unobservable parameters to be estimated;
z_i is a scalar and is part of the observed data;
δ is a scalar and is an unobservable parameter to be estimated;
the error terms u_i are unobservable random variables having expected value 0 (conditionally on x_i and z_i);
the dependent variables y_i are part of the observed data.

We let

$X = \left[ \begin{array}{c} x_1 \\ \vdots \\ x_n \end{array} \right] \in \mathbb{R}^{n\times p},$

and

$Y = \left[ \begin{array}{c} y_1 \\ \vdots \\ y_n \end{array} \right],\quad Z = \left[ \begin{array}{c} z_1 \\ \vdots \\ z_n \end{array} \right],\quad U = \left[ \begin{array}{c} u_1 \\ \vdots \\ u_n \end{array} \right] \in \mathbb{R}^{n\times 1}.$

Then, through the usual least squares calculation, the estimated parameter vector $\hat{\beta}$ based only on the observed x values, omitting the observed z values, is given by:

$\hat{\beta} = (X'X)^{-1}X'Y\,$

(where the "prime" notation means the transpose of a matrix).

Substituting for Y based on the assumed linear model,

$\begin{align} \hat{\beta} & = (X'X)^{-1}X'(X\beta+Z\delta+U) \\ & =(X'X)^{-1}X'X\beta + (X'X)^{-1}X'Z\delta + (X'X)^{-1}X'U \\ & =\beta + (X'X)^{-1}X'Z\delta + (X'X)^{-1}X'U. \end{align}$

On taking expectations conditional on X and Z, the final term vanishes, because the errors U have conditional expectation zero given X and Z. The remaining terms give:

$\begin{align} E[ \hat{\beta} | X, Z ] & = \beta + (X'X)^{-1}X'Z\delta \\ & = \beta + \text{bias}. \end{align}$

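The second term above is the omitted-variable bias in this case. It equals δ times $(X'X)^{-1}X'Z$, the vector of coefficients from a least squares regression of Z on X; that is, the bias is the portion of z_i which is "explained" by x_i, scaled by δ. The bias vanishes if δ = 0 or if X'Z = 0.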
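The bias term $(X'X)^{-1}X'Z\delta$ can be checked numerically. The following is a minimal sketch, assuming illustrative values for n, p, β and δ and standard normal errors (none of which come from the text): it holds X and Z fixed, redraws U many times, and compares the average of $\hat{\beta}$ with β plus the predicted bias.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative (assumed) dimensions and true parameters.
n, p = 200, 2
beta = np.array([1.0, -0.5])
delta = 2.0

# Fixed design: z is correlated with the first column of X.
X = rng.standard_normal((n, p))
Z = 0.7 * X[:, 0] + rng.standard_normal(n)

# Predicted bias: (X'X)^{-1} X'Z * delta.
XtX = X.T @ X
bias = np.linalg.solve(XtX, X.T @ Z) * delta

# Monte Carlo over the errors only, holding X and Z fixed.
reps = 5000
U = rng.standard_normal((n, reps))            # one error vector per column
Y = (X @ beta + Z * delta)[:, None] + U       # shape (n, reps)
estimates = np.linalg.solve(XtX, X.T @ Y)     # shape (p, reps)
mean_estimate = estimates.mean(axis=1)

print("average estimate:", mean_estimate)
print("beta + bias:     ", beta + bias)
```

The two printed vectors should agree up to simulation noise, while both differ from β in the component whose regressor is correlated with z.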

Effects on ordinary least squares

The Gauss–Markov theorem states that, under the assumptions of the classical linear regression model, the ordinary least squares (OLS) estimator is the best linear unbiased estimator. The relevant assumption here is that the error term is uncorrelated with the regressors.

The presence of omitted-variable bias violates this particular assumption, because the omitted variable is absorbed into the error term, which is then correlated with the included regressors. The violation causes the OLS estimator to be biased and inconsistent. The direction of the bias depends on the sign of the omitted variable's coefficient and on the covariance between the regressors and the omitted variable. Given a positive coefficient on the omitted variable, a positive covariance leads the OLS estimator to overestimate the true value of the parameter. This effect can be seen by taking the expectation of the estimator, as shown in the previous section.
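As a worked special case of the bias term $(X'X)^{-1}X'Z\delta$ from the previous section, assume a single included regressor (p = 1) and no intercept, so that each x_i is a scalar. This simplification is an assumption made here for illustration. The bias then reduces to

$\text{bias} = (X'X)^{-1}X'Z\delta = \delta\,\frac{\sum_{i=1}^{n} x_i z_i}{\sum_{i=1}^{n} x_i^2}.$

The denominator is positive, so the sign of the bias is the sign of δ times the sign of $\sum_i x_i z_i$, which for mean-zero data is n times the sample covariance of x and z. If both are positive or both are negative, the coefficient on x is overestimated; if they have opposite signs, it is underestimated.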

References

Greene, W. H. (1993). Econometric Analysis, 2nd ed. Macmillan. pp. 245–246.
Barreto and Howland (2005). Introductory Econometrics: Using Monte Carlo Simulation with Microsoft Excel. Cambridge University Press. http://www3.wabash.edu/econometrics/EconometricsBook/chap18.htm.


Wikimedia Foundation. 2010.
