Large deviations theory

In probability theory, Large Deviations Theory concerns the asymptotic behaviour of remote tails of sequences of probability distributions. Some basic ideas of the theory can be traced back to Laplace and Cramér, although a clear unified formal definition was introduced in 1966 by Varadhan [S.R.S. Varadhan, "Asymptotic probability and differential equations", Comm. Pure Appl. Math. 19 (1966), 261-286]. Large Deviations Theory formalizes the heuristic ideas of "concentration of measures" and widely generalizes the notion of convergence of probability measures.

Roughly speaking, Large Deviations Theory concerns itself with the exponential decay of the probabilities of certain kinds of extreme or "tail" events, as the number of observations grows arbitrarily large.

Introductory examples

An elementary example

Consider a sequence of independent tosses of a fair coin. Each toss comes up heads or tails. Let X_i denote the outcome of the i-th trial, where we encode heads as -1 and tails as 1. Now let M_N denote the mean value after N trials, namely

: M_N := \frac{1}{N} \sum_{i=1}^{N} X_i

Then M_N lies between -1 and 1. From the law of large numbers (and also from our experience) we know that as N becomes larger and larger, M_N becomes closer and closer to 0 with increasing probability. Let us make this statement more precise. For a given value x > 0, let us compute the probability P(M_N > x) that M_N is greater than x. By the Chernoff inequality it can be shown that P(M_N > x) < \exp(-x^2 N/2). This bound is rather sharp, in a suitable technical sense. In other words, the probability P(M_N > x) decays exponentially fast as N grows large, at a rate depending on x.
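The bound can be checked directly for small N, because the number of tails among N fair tosses is binomially distributed. The following is a minimal sketch, not part of the original argument; the function names and the choice x = 1/5 are illustrative assumptions, and x is passed as an exact fraction so that the strict inequality M_N > x is not blurred by rounding.

```python
import math
from fractions import Fraction


def tail_probability(n: int, x: Fraction) -> float:
    """Exact P(M_N > x) for N = n fair tosses encoded as -1 / +1.

    If k tosses equal +1, then M_N = (2k - n) / n, so M_N > x
    is the same event as k > n * (1 + x) / 2.
    """
    k_min = math.floor(n * (1 + x) / 2) + 1  # smallest k with M_N > x
    favourable = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return favourable / 2 ** n


def chernoff_bound(n: int, x: Fraction) -> float:
    """The bound exp(-x^2 N / 2) quoted in the text."""
    return math.exp(-float(x) ** 2 * n / 2)


x = Fraction(1, 5)  # illustrative threshold
for n in (10, 100, 1000):
    print(n, tail_probability(n, x), chernoff_bound(n, x))
```

For each N the first printed number should lie below the second, and both should shrink as N grows.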

Large Deviations for sums of independent random variables

In the coin-tossing example above we tacitly assumed that each toss is an independent trial, and that the probability of getting heads or tails is the same for every toss. This makes the random variables X_i independent and identically distributed (i.i.d.). For i.i.d. variables whose common distribution satisfies a certain growth condition, large deviation theory states that the following limit exists:

: \lim_{N \to \infty} \frac{1}{N} \log P(M_N > x) = -I(x)

The function I(x) is called the "rate function" or "Cramér function" or sometimes the "entropy function". Roughly speaking, the existence of this limit is what establishes the exponential decay mentioned above and allows us to conclude that for large N, P(M_N > x) takes the form:

: P(M_N > x) \approx \exp[-N I(x)]

which is the basic result of Large Deviations Theory in this setting. Note that the Chernoff inequality given in the elementary example, as opposed to the asymptotic formula presented here, requires an additional argument.
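The existence of the limit can be illustrated numerically for the fair coin by evaluating -(1/N) log P(M_N > x) exactly for increasing N. This is only a sketch under illustrative assumptions (the function name, the threshold x = 1/2 and the sample sizes are not from the text); it works with logarithms of exact integer counts so that very small probabilities do not underflow.

```python
import math
from fractions import Fraction


def decay_rate_estimate(n: int, x: Fraction) -> float:
    """Return -(1/N) log P(M_N > x) for N = n fair -1 / +1 tosses."""
    k_min = math.floor(n * (1 + x) / 2) + 1  # smallest count of +1 with M_N > x
    favourable = sum(math.comb(n, k) for k in range(k_min, n + 1))
    # log P = log(favourable) - N log 2; math.log accepts large integers
    log_p = math.log(favourable) - n * math.log(2)
    return -log_p / n


x = Fraction(1, 2)  # illustrative threshold
for n in (200, 800, 3200):
    print(n, decay_rate_estimate(n, x))
```

The printed values should decrease slowly and level off as N grows. They stay above x^2/2, the exponent of the Chernoff bound, which is consistent with that bound being an inequality rather than the exact rate.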

If we know the probability distribution of X_i, an explicit expression for the rate function can be obtained. This is given by a Legendre transform

: I(x) = \sup_{\theta > 0} [\theta x - \lambda(\theta)]

where the function \lambda(\theta) is called the cumulant generating function (CGF), given by

: \lambda(\theta) = \log E[\exp(\theta X)] .

Here E[\cdot] denotes the expectation with respect to the probability distribution of X_i, and X is any one of the X_i. If X_i follows a Gaussian distribution, the rate function becomes a parabola with its vertex at the mean of the Gaussian distribution.
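The Legendre transform can be evaluated numerically when no closed form is at hand. The sketch below is an illustration under stated assumptions, not a reference implementation: the supremum is searched over a bounded interval 0 <= \theta <= theta_max (an arbitrary cutoff chosen here), using the fact that \theta x - \lambda(\theta) is concave in \theta. The two CGFs used are standard: \lambda(\theta) = \log\cosh\theta for the fair -1 / +1 coin, and \lambda(\theta) = \mu\theta + \sigma^2\theta^2/2 for a Gaussian with mean \mu and variance \sigma^2.

```python
import math


def cgf_coin(theta: float) -> float:
    """log E[exp(theta X)] = log cosh(theta), written to avoid overflow."""
    a = abs(theta)
    return a + math.log1p(math.exp(-2 * a)) - math.log(2)


def cgf_gaussian(theta: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """CGF of a Gaussian with mean mu and standard deviation sigma."""
    return mu * theta + 0.5 * sigma ** 2 * theta ** 2


def rate_function(cgf, x: float, theta_max: float = 50.0, iterations: int = 200) -> float:
    """Approximate sup over 0 <= theta <= theta_max of theta*x - cgf(theta).

    Ternary search; valid because the objective is concave in theta.
    """
    lo, hi = 0.0, theta_max
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if m1 * x - cgf(m1) < m2 * x - cgf(m2):
            lo = m1  # maximiser lies to the right of m1
        else:
            hi = m2  # maximiser lies to the left of m2
    theta = (lo + hi) / 2
    return theta * x - cgf(theta)


print(rate_function(cgf_coin, 0.5))
print(rate_function(cgf_gaussian, 1.5))
```

For the standard Gaussian the result should agree with the parabola x^2/2, and for the coin with the known closed form ((1+x)/2) log(1+x) + ((1-x)/2) log(1-x). The cutoff theta_max must be large enough to contain the maximiser, which holds for the values of x used here.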

If the i.i.d. condition is relaxed, in particular if the variables X_i are not independent but nevertheless satisfy the Markov property, the basic large deviations result stated above can be generalized.

Formal Definition

Given a Polish space X, let \{\mathbb{P}_N\} be a sequence of Borel probability measures on X, let \{a_N\} be a sequence of positive real numbers such that \lim_N a_N = +\infty, and finally let I: X \to [0, +\infty] be a lower semicontinuous functional on X. The sequence \{\mathbb{P}_N\} is said to satisfy a large deviation principle with "speed" \{a_N\} and "rate" I if and only if, for each Borel measurable set E \subset X,

: -\inf_{x \in E^\circ} I(x) \le \varliminf_N a_N^{-1} \log\big(\mathbb{P}_N(E)\big) \le \varlimsup_N a_N^{-1} \log\big(\mathbb{P}_N(E)\big) \le -\inf_{x \in \bar{E}} I(x)

where \bar{E} and E^\circ denote respectively the closure and interior of E.
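As a hedged illustration of how the definition specializes, consider the coin-tossing setting with X the real line, \mathbb{P}_N the law of M_N, speed a_N = N and E = (x, \infty) for some 0 < x < 1. These choices are assumptions made here for the example. It is also assumed that I is continuous and non-decreasing on [0, 1), which holds for the coin's rate function. Under these assumptions the two infima coincide, and the two-sided bound collapses to the limit stated for sums of independent variables:

```latex
\begin{align*}
E^\circ &= (x, \infty), \qquad \bar{E} = [x, \infty), \\
\inf_{y \in E^\circ} I(y) &= \inf_{y \in \bar{E}} I(y) = I(x), \\
\lim_{N \to \infty} \frac{1}{N} \log \mathbb{P}_N(E)
  &= \lim_{N \to \infty} \frac{1}{N} \log P(M_N > x) = -I(x).
\end{align*}
```

For sets whose interior and closure give different infima, the definition only provides the lower and upper bounds, not a limit.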

Brief History

The first rigorous results concerning Large Deviations are due to the Swedish mathematician Harald Cramér, who applied them to model the insurance business. From the point of view of an insurance company, earnings come in at a constant rate per month (the monthly premium) but claims arrive randomly. For the company to be successful over a certain period of time (preferably many months), the total earnings should exceed the total claims. Thus, to estimate the premium one has to ask the following question: "What should we choose as the premium q such that over N months the total claim C = \sum X_i is less than Nq?" This is clearly the same question asked by large deviations theory. Cramér gave a solution to this question for i.i.d. Gaussian random variables, where the rate function is expressed as a power series. The results quoted above were later obtained by Chernoff, among others. A very incomplete list of mathematicians who have made important advances would include S.R.S. Varadhan (who has won the Abel prize), D. Ruelle and O.E. Lanford.

Applications

Establishing Large Deviations Principles is one of the most effective ways to extract information from a probabilistic model. Some of the best known applications of Large Deviations Theory arise in Statistical Mechanics, Quantum Mechanics, Information Theory and Risk Management.

Applications to Statistical Mechanics: Large Deviation and Entropy

The rate function is related to the entropy in statistical mechanics. This can be seen heuristically in the following way. In statistical mechanics the entropy of a particular macro-state is related to the number of micro-states that correspond to it. In our coin-tossing example the mean value M_N could designate a particular macro-state, and a particular sequence of heads and tails that gives rise to a given value of M_N constitutes a particular micro-state. Loosely speaking, a macro-state with more micro-states giving rise to it has higher entropy, and a state with higher entropy has a greater chance of being realised in actual experiments. The macro-state with mean value zero (as many heads as tails) has the highest number of micro-states giving rise to it, and it is indeed the state with the highest entropy. In most practical situations we shall indeed obtain this macro-state for a large number of trials. The "rate function", on the other hand, measures the probability of appearance of a particular macro-state: the smaller the rate function, the higher the chance of that macro-state appearing. In our coin-tossing example the value of the "rate function" at mean value zero is zero. In this way one can see the "rate function" as the negative of the "entropy".
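The heuristic can be made concrete for the coin by counting micro-states. A macro-state with k tails out of N tosses has mean x = 2k/N - 1 and is realised by "N choose k" sequences. The sketch below compares the entropy per toss, (1/N) log of that count, with log 2 - I(x), where I is the standard closed-form rate function of the fair coin. In this example the rate function is thus the negative of the entropy up to the additive constant log 2. The function names and sample values are illustrative assumptions.

```python
import math


def entropy_per_toss(n: int, k: int) -> float:
    """(1/N) log of the number of micro-states with k tosses equal to +1."""
    return math.log(math.comb(n, k)) / n


def rate_function_coin(x: float) -> float:
    """Closed-form rate function of the fair -1 / +1 coin, for -1 <= x <= 1."""
    def xlogx(u: float) -> float:
        return u * math.log(u) if u > 0 else 0.0  # convention 0 * log 0 = 0
    return 0.5 * (xlogx(1 + x) + xlogx(1 - x))


n = 4000  # illustrative number of tosses
for k in (2000, 2400, 3000):
    x = 2 * k / n - 1  # mean value of this macro-state
    print(x, entropy_per_toss(n, k), math.log(2) - rate_function_coin(x))
```

The two printed columns should agree up to a correction that vanishes as N grows, with the largest entropy at x = 0, where the rate function is zero.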

See also

* Chernoff's inequality
* Contraction principle (large deviations theory), a result on how large deviations principles "push forward"
* Freidlin-Wentzell theorem, a large deviations principle for Itō diffusions
* Laplace principle, a large deviations principle in \mathbb{R}^d
* Schilder's theorem, a large deviations principle for Brownian motion
* Varadhan's lemma
* Extreme value theory

External links

* [http://www.cl.cam.ac.uk/Research/SRG/netos/old-projects/measure/tutorial/rev-tutorial.ps.gz An elementary introduction to the Large Deviations Theory]
* [http://www.abelprisen.no/en/prisvinnere/2007/ Abel Prize 2007 awarded to S.R.S. Varadhan]


Wikimedia Foundation. 2010.
