Statistical independence


In probability theory, to say that two events are independent means, intuitively, that the occurrence of one event makes it neither more nor less probable that the other occurs. For example:

* The event of getting a 6 the first time a die is rolled and the event of getting a 6 the second time are "independent".
* By contrast, the event of getting a 6 the first time a die is rolled and the event that the sum of the numbers seen on the first and second trials is 8 are "dependent".
* If two cards are drawn "with" replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are "independent".
* By contrast, if two cards are drawn "without" replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are "dependent".

Similarly, two random variables are independent if the conditional probability distribution of either given the observed value of the other is the same as if the other's value had not been observed.

Independent events

The standard definition says:

:Two events "A" and "B" are independent if and only if Pr("A" ∩ "B") = Pr("A")Pr("B").

Here "A" ∩ "B" is the intersection of "A" and "B", that is, it is the event that both events "A" and "B" occur.
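The dice examples from the introduction can be checked against this definition by enumerating the 36 equally likely outcomes of two rolls. The following is a minimal sketch, not a general-purpose tool; the helper names `prob` and `are_independent` are illustrative and assume a finite sample space with equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two rolls of a fair die.
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on (first, second)."""
    favourable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favourable, len(OUTCOMES))

def are_independent(event_a, event_b):
    """Check Pr(A and B) == Pr(A) * Pr(B) with exact rational arithmetic."""
    joint = prob(lambda o: event_a(o) and event_b(o))
    return joint == prob(event_a) * prob(event_b)

first_is_six = lambda o: o[0] == 6
second_is_six = lambda o: o[1] == 6
sum_is_eight = lambda o: o[0] + o[1] == 8

print(are_independent(first_is_six, second_is_six))  # True
print(are_independent(first_is_six, sum_is_eight))   # False
```

Exact fractions are used instead of floating point so that the equality test in the definition is not affected by rounding.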

More generally, any collection of events -- possibly more than just two of them -- is mutually independent if and only if for any finite subset "A_1", ..., "A_n" of the collection we have

:\Pr\left(\bigcap_{i=1}^n A_i\right)=\prod_{i=1}^n \Pr(A_i).

This is called the "multiplication rule" for independent events.
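The requirement that the rule hold for every finite subset, and not only for pairs, matters in practice. The sketch below uses a standard illustrative example (two fair coin flips, with a third event "both flips agree") in which every pair of events satisfies the multiplication rule but the triple does not. The function name `mutually_independent` is a hypothetical helper, and the sample space is assumed finite with equally likely outcomes.

```python
from fractions import Fraction
from itertools import combinations, product

# Sample space: two fair coin flips, four equally likely outcomes.
OUTCOMES = list(product("HT", repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))

def mutually_independent(events):
    """True if the multiplication rule holds for every subset of size >= 2."""
    for size in range(2, len(events) + 1):
        for subset in combinations(events, size):
            joint = prob(lambda o: all(e(o) for e in subset))
            expected = Fraction(1)
            for e in subset:
                expected *= prob(e)
            if joint != expected:
                return False
    return True

first_heads = lambda o: o[0] == "H"
second_heads = lambda o: o[1] == "H"
flips_agree = lambda o: o[0] == o[1]

events = [first_heads, second_heads, flips_agree]
pairwise = all(mutually_independent(list(pair)) for pair in combinations(events, 2))
print(pairwise)                      # True
print(mutually_independent(events))  # False
```

Here each event has probability 1/2 and each pairwise intersection has probability 1/4, but all three events occur together with probability 1/4 rather than 1/8.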

If two events "A" and "B" are independent, then the conditional probability of "A" given "B" is the same as the unconditional (or marginal) probability of "A", that is,

:\Pr(A\mid B)=\Pr(A).

There are at least two reasons why this statement is not taken to be the definition of independence: (1) the two events "A" and "B" do not play symmetrical roles in this statement, and (2) problems arise with this statement when events of probability 0 are involved.

The conditional probability of event A given B is given by

:\Pr(A\mid B)=\frac{\Pr(A\cap B)}{\Pr(B)}, (so long as Pr("B") ≠ 0)

Substituting this formula into the statement above shows that, whenever Pr("B") ≠ 0, the statement is equivalent to

:\Pr(A\cap B)=\Pr(A)\Pr(B)

which is the standard definition given above.
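The card-drawing examples from the introduction illustrate the conditional form. The sketch below assumes a deck encoded as 26 red and 26 black cards (the encoding and function names are illustrative only) and compares Pr(second red | first red) with the unconditional Pr(second red), with and without replacement.

```python
from fractions import Fraction
from itertools import permutations, product

# Hypothetical deck encoding: 26 red ("R") and 26 black ("B") cards.
DECK = [("R", i) for i in range(26)] + [("B", i) for i in range(26)]

def second_red_given_first_red(draws):
    """Pr(second red | first red) = Pr(both red) / Pr(first red)."""
    both_red = sum(1 for a, b in draws if a[0] == "R" and b[0] == "R")
    first_red = sum(1 for a, b in draws if a[0] == "R")
    return Fraction(both_red, first_red)

def second_red(draws):
    """Unconditional (marginal) probability that the second card is red."""
    return Fraction(sum(1 for a, b in draws if b[0] == "R"), len(draws))

# All equally likely ordered pairs of draws under each scheme.
with_replacement = list(product(DECK, repeat=2))      # 52 * 52 pairs
without_replacement = list(permutations(DECK, 2))     # 52 * 51 pairs

print(second_red_given_first_red(with_replacement), second_red(with_replacement))
# 1/2 1/2  (conditioning changes nothing: independent)
print(second_red_given_first_red(without_replacement), second_red(without_replacement))
# 25/51 1/2  (conditioning lowers the probability: dependent)
```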

Note that independence does not have the same meaning here as it does in the vernacular. For example, an event is independent of itself if and only if

:\Pr(A) = \Pr(A\cap A) = \Pr(A)\Pr(A).

That is, if its probability is one or zero. Thus if an event or its complement almost surely occurs, it is independent of itself. For example, if event "A" is choosing any number but 0.5 from a uniform distribution on the unit interval, "A" is independent of itself, even though, tautologically, "A" fully determines "A".

Independent random variables

What is defined above is independence of "events". In this section we treat independence of random variables. If "X" is a real-valued random variable and "a" is a number, then the event "{X ≤ a}" is the set of outcomes that correspond to "X" being less than or equal to "a". As these are sets of outcomes that have probabilities, it makes sense to refer to events of this sort being independent of other events of this sort.

Two random variables "X" and "Y" are independent if and only if for any numbers "a" and "b" the events "{X ≤ a}" (the outcomes where "X" is less than or equal to "a") and "{Y ≤ b}" are independent events as defined above. Similarly, an arbitrary collection of random variables -- possibly more than just two of them -- is independent precisely if for any finite collection "X_1", ..., "X_n" and any finite set of numbers "a_1", ..., "a_n", the events "{X_1 ≤ a_1}", ..., "{X_n ≤ a_n}" are independent events as defined above.

The measure-theoretically inclined may prefer to substitute events "{X ∈ A}" for events "{X ≤ a}" in the above definition, where "A" is any Borel set. That definition is exactly equivalent to the one above when the values of the random variables are real numbers. It has the advantage of working also for complex-valued random variables or for random variables taking values in any topological space.

If every pair of random variables in a collection is independent, the collection may nonetheless fail to be mutually independent; the weaker property is called pairwise independence.

If "X" and "Y" are independent (and the relevant expectations exist), then the expectation operator "E" has the nice property

:E["X" "Y"] = E["X"] E["Y"],

and for the variance we have

:var("X" + "Y") = var("X") + var("Y"),

so the covariance cov("X","Y") is zero. (The converse, i.e. the proposition that two random variables with a covariance of 0 must be independent, is not true. See uncorrelated.)
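A small counterexample makes the failure of the converse concrete. The sketch below assumes "X" is uniform on {-1, 0, 1} and "Y" = "X"² (an illustrative choice, not taken from the text above): the covariance is exactly zero, yet "Y" is completely determined by "X".

```python
from fractions import Fraction

# Joint pmf of (X, Y) with X uniform on {-1, 0, 1} and Y = X**2.
pmf = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expectation(g):
    """Expected value of g(X, Y) under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

e_x = expectation(lambda x, y: x)
e_y = expectation(lambda x, y: y)
covariance = expectation(lambda x, y: x * y) - e_x * e_y

# Independence would require Pr(X = 0, Y = 0) == Pr(X = 0) * Pr(Y = 0).
p_joint = pmf.get((0, 0), Fraction(0))
p_x0 = sum(p for (x, _), p in pmf.items() if x == 0)
p_y0 = sum(p for (_, y), p in pmf.items() if y == 0)

print(covariance)             # 0
print(p_joint, p_x0 * p_y0)   # 1/3 1/9
```

Since 1/3 differs from 1/9, the product condition fails and the two variables are dependent despite being uncorrelated.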

Furthermore, random variables "X" and "Y" with distribution functions "F_X(x)" and "F_Y(y)", and probability densities "f_X(x)" and "f_Y(y)", are independent if and only if the combined random variable ("X","Y") has a joint distribution function

::F_{X,Y}(x,y) = F_X(x) F_Y(y),

or equivalently, a joint density

::f_{X,Y}(x,y) = f_X(x) f_Y(y).

Similar expressions characterise independence more generally for more than two random variables.
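For discrete random variables the probability mass function plays the role of the density, and the factorisation can be tested directly on a finite table. The following is a sketch under that assumption; `is_independent` and the example tables are hypothetical names introduced for illustration.

```python
from collections import defaultdict
from fractions import Fraction

def is_independent(joint):
    """joint maps (x, y) -> probability; test joint == product of marginals."""
    px, py = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint.items():
        px[x] += p  # marginal pmf of X
        py[y] += p  # marginal pmf of Y
    # The factorisation must hold for every pair of values, including
    # pairs that the joint table assigns probability zero.
    return all(joint.get((x, y), 0) == px[x] * py[y] for x in px for y in py)

# Built as a product of marginals (1/4, 3/4) and (1/3, 2/3).
factorised = {
    (0, 0): Fraction(1, 12), (0, 1): Fraction(1, 6),
    (1, 0): Fraction(1, 4),  (1, 1): Fraction(1, 2),
}
# All mass on the diagonal: Y always equals X.
diagonal = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

print(is_independent(factorised))  # True
print(is_independent(diagonal))    # False
```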

Conditionally independent random variables

Intuitively, two random variables "X" and "Y" are conditionally independent given "Z" if, once "Z" is known, the value of "Y" does not add any additional information about "X". For instance, two measurements "X" and "Y" of the same underlying quantity "Z" are not independent, but they are conditionally independent given "Z" (unless the errors in the two measurements are somehow connected).

The formal definition of conditional independence is based on the idea of conditional distributions. If "X", "Y", and "Z" are discrete random variables, then we define "X" and "Y" to be "conditionally independent given" "Z" if

:\mathrm{P}(X \le x, Y \le y\;|\;Z = z) = \mathrm{P}(X \le x\;|\;Z = z) \cdot \mathrm{P}(Y \le y\;|\;Z = z)

for all "x", "y" and "z" such that \mathrm{P}(Z = z) > 0. On the other hand, if the random variables are continuous and have a joint probability density function "p", then "X" and "Y" are conditionally independent given "Z" if

:p_{XY|Z}(x, y | z) = p_{X|Z}(x | z) \cdot p_{Y|Z}(y | z)

for all real numbers "x", "y" and "z" such that p_Z(z) > 0.

If "X" and "Y" are conditionally independent given "Z", then

:\mathrm{P}(X = x | Y = y, Z = z) = \mathrm{P}(X = x | Z = z)

for any "x", "y" and "z" with \mathrm{P}(Y = y, Z = z) > 0. That is, the conditional distribution for "X" given "Y" and "Z" is the same as that given "Z" alone. A similar equation holds for the conditional probability density functions in the continuous case.
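The measurement example can be made concrete with a toy model. The sketch below assumes "Z" is a fair bit and that "X" and "Y" are two readings of "Z", each flipped independently with probability 1/4 (the error rate and all names are assumptions chosen for illustration). It tests the discrete condition in its equivalent point-probability form, P(X = x, Y = y | Z = z) = P(X = x | Z = z) · P(Y = y | Z = z).

```python
from fractions import Fraction
from itertools import product

FLIP = Fraction(1, 4)  # assumed error rate of each measurement

def noise(reading, truth):
    """Probability of observing `reading` when the true bit is `truth`."""
    return 1 - FLIP if reading == truth else FLIP

# Joint pmf of (X, Y, Z): Z is a fair bit, X and Y are noisy copies of Z.
pmf = {(x, y, z): Fraction(1, 2) * noise(x, z) * noise(y, z)
       for x, y, z in product((0, 1), repeat=3)}

def prob(pred):
    """Probability of the event described by a predicate on (x, y, z)."""
    return sum(p for xyz, p in pmf.items() if pred(*xyz))

def independent():
    """Unconditional independence of X and Y."""
    return all(
        prob(lambda x, y, z: x == a and y == b)
        == prob(lambda x, y, z: x == a) * prob(lambda x, y, z: y == b)
        for a, b in product((0, 1), repeat=2))

def conditionally_independent_given_z():
    """Conditional independence of X and Y given each value of Z."""
    for a, b, c in product((0, 1), repeat=3):
        pz = prob(lambda x, y, z: z == c)
        lhs = prob(lambda x, y, z: x == a and y == b and z == c) / pz
        rhs = (prob(lambda x, y, z: x == a and z == c) / pz
               * prob(lambda x, y, z: y == b and z == c) / pz)
        if lhs != rhs:
            return False
    return True

print(independent())                        # False
print(conditionally_independent_given_z())  # True
```

In this model P(X = 1, Y = 1) is 5/16 while P(X = 1) · P(Y = 1) is 1/4, so the readings are dependent until "Z" is fixed.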

Independence can be seen as a special kind of conditional independence, since probability can be seen as a kind of conditional probability given no events.

See also

* Copula (statistics)
* Independent and identically-distributed random variables


Wikimedia Foundation. 2010.
