Geometric distribution

Probability distribution (two variants)
 name        = Geometric
 type        = mass
 parameters  = 0 < p \leq 1 success probability (real)
 support     = k \in \{1, 2, 3, \dots\}
 pdf         = (1-p)^{k-1}\,p
 cdf         = 1-(1-p)^k
 mean        = \frac{1}{p}
 median      = \left\lceil \frac{-\log 2}{\log(1-p)} \right\rceil (not unique if -\log 2/\log(1-p) is an integer)
 mode        = 1
 variance    = \frac{1-p}{p^2}
 skewness    = \frac{2-p}{\sqrt{1-p}}
 kurtosis    = 6 + \frac{p^2}{1-p}
 entropy     = \frac{-(1-p)\log_2(1-p) - p\log_2 p}{p}
 mgf         = \frac{p e^t}{1-(1-p)e^t}
 char        = \frac{p e^{it}}{1-(1-p)\,e^{it}}

 parameters2 = 0 < p \leq 1 success probability (real)
 support2    = k \in \{0, 1, 2, 3, \dots\}
 pdf2        = (1-p)^k\,p
 cdf2        = 1-(1-p)^{k+1}
 mean2       = \frac{1-p}{p}
 median2     =
 mode2       = 0
 variance2   = \frac{1-p}{p^2}
 skewness2   = \frac{2-p}{\sqrt{1-p}}
 kurtosis2   = 6 + \frac{p^2}{1-p}
 entropy2    =
 mgf2        = \frac{p}{1-(1-p)e^t}
 char2       = \frac{p}{1-(1-p)\,e^{it}}

In probability theory and statistics, the geometric distribution is either of two discrete probability distributions:

* the probability distribution of the number "X" of Bernoulli trials needed to get one success, supported on the set { 1, 2, 3, ...}, or

* the probability distribution of the number "Y" = "X" − 1 of failures before the first success, supported on the set { 0, 1, 2, 3, ... }.

Which of these one calls "the" geometric distribution is a matter of convention and convenience.

These two different geometric distributions should not be confused with each other. The name "shifted" geometric distribution is often adopted for the latter (the distribution of the number "Y"); however, to avoid ambiguity, it is wise to indicate which is intended by stating the support explicitly.

If the probability of success on each trial is "p", then the probability that the "k"th trial is the first success is

:\Pr(X = k) = (1-p)^{k-1}\,p

for "k" = 1, 2, 3, ....

Equivalently, if the probability of success on each trial is "p", then the probability that there are "k" failures before the first success is

:\Pr(Y = k) = (1-p)^k\,p

for "k" = 0, 1, 2, 3, ....

In either case, the sequence of probabilities is a geometric sequence.

For example, suppose an ordinary die is thrown repeatedly until the first time a "1" appears. The probability distribution of the number of times it is thrown is supported on the infinite set { 1, 2, 3, ... } and is a geometric distribution with "p" = 1/6.
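The die example can be computed directly from the probability mass function; a minimal sketch (the function name is chosen for illustration):

```python
# PMF of the trials-to-first-success variant: Pr(X = k) = (1 - p)**(k - 1) * p.
def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k (k = 1, 2, 3, ...)."""
    return (1 - p) ** (k - 1) * p

# First "1" on a fair die: p = 1/6.  Successive probabilities form a
# geometric sequence with common ratio 1 - p = 5/6.
p = 1 / 6
probs = [geometric_pmf(k, p) for k in range(1, 7)]
```

Each probability is 5/6 of the one before it, which is exactly what makes the sequence geometric.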

Moments and cumulants

The expected value of a geometrically distributed random variable "X" is 1/"p" and the variance is (1 − "p")/"p"^2:

:\mathrm{E}(X) = \frac{1}{p}, \qquad \mathrm{var}(X) = \frac{1-p}{p^2}.

Similarly, the expected value of the geometrically distributed random variable "Y" is (1 − "p")/"p", and its variance is (1 − "p")/"p"^2:

:\mathrm{E}(Y) = \frac{1-p}{p}, \qquad \mathrm{var}(Y) = \frac{1-p}{p^2}.
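These formulas can be checked numerically by truncating the defining series; a sketch (the cutoff is an arbitrary choice, large enough here that the neglected tail is negligible):

```python
# Truncated-series check of E(X) = 1/p and var(X) = (1 - p)/p**2
# for the {1, 2, 3, ...} variant.
def moments_X(p, cutoff=10_000):
    mean = sum(k * (1 - p) ** (k - 1) * p for k in range(1, cutoff))
    second = sum(k * k * (1 - p) ** (k - 1) * p for k in range(1, cutoff))
    return mean, second - mean ** 2

p = 0.3
mean, var = moments_X(p)
# mean is close to 1/p and var is close to (1 - p)/p**2
```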

Let "μ" = (1 − "p")/"p" be the expected value of "Y". Then the cumulants \kappa_n of the probability distribution of "Y" satisfy the recursion

:\kappa_{n+1} = \mu(\mu+1) \frac{d\kappa_n}{d\mu}.

"Outline of proof:" That the expected value is (1 − "p")/"p" can be shown as follows (the exchange of summation and differentiation is justified by the absolute convergence of the geometric series for 0 < "p" ≤ 1). Let "Y" be as above. Then

:\begin{align}
\mathrm{E}(Y) & = \sum_{k=0}^\infty (1-p)^k p \cdot k \\
& = p \sum_{k=0}^\infty k (1-p)^k \\
& = p(1-p) \sum_{k=1}^\infty k (1-p)^{k-1} \\
& = -p(1-p) \frac{d}{dp} \sum_{k=0}^\infty (1-p)^k \\
& = -p(1-p) \frac{d}{dp} \frac{1}{p} = \frac{1-p}{p}.
\end{align}
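As a sanity check on the cumulant recursion, its first two steps reproduce the known variance and (via the skewness formula) the third cumulant of "Y"; a sketch with a concrete value of "p":

```python
# First steps of kappa_{n+1} = mu(mu+1) * d(kappa_n)/d(mu),
# starting from kappa_1 = mu = (1 - p)/p.
p = 0.3
mu = (1 - p) / p

kappa_2 = mu * (mu + 1)                  # d(kappa_1)/d(mu) = 1
kappa_3 = mu * (mu + 1) * (2 * mu + 1)   # d(kappa_2)/d(mu) = 2*mu + 1

var_Y = (1 - p) / p ** 2                 # known variance of Y
third = (2 - p) * (1 - p) / p ** 3       # skewness * sigma**3
```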

Parameter estimation

For both variants of the geometric distribution, the parameter "p" can be estimated by equating the expected value with the sample mean. This is the method of moments, which in this case happens to yield maximum likelihood estimates of "p".

Specifically, for the first variant let k_1,dots,k_n be a sample where k_i geq 1 for i=1,dots,n. Then "p" can be estimated as

:\widehat{p} = \left(\frac{1}{n} \sum_{i=1}^n k_i\right)^{-1}.
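A simulation sketch of this estimator (the sampling helper `draw_X` is hypothetical, written only for this example):

```python
import random

# Draw X = number of Bernoulli(p) trials up to and including the first success.
def draw_X(p, rng):
    k = 1
    while rng.random() >= p:
        k += 1
    return k

rng = random.Random(0)
p_true = 0.25
sample = [draw_X(p_true, rng) for _ in range(100_000)]

# Method-of-moments / maximum likelihood estimate: reciprocal of the sample mean.
p_hat = len(sample) / sum(sample)
```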

In Bayesian inference, the Beta distribution is the conjugate prior distribution for the parameter "p". If this parameter is given a Beta("α", "β") prior, then the posterior distribution is

:p \sim \mathrm{Beta}\left(\alpha + n,\ \beta + \sum_{i=1}^n (k_i - 1)\right).

The posterior mean \mathrm{E}[p] approaches the maximum likelihood estimate \widehat{p} as "α" and "β" approach zero.

In the alternative case, let k_1,dots,k_n be a sample where k_i geq 0 for i=1,dots,n. Then "p" can be estimated as

:\widehat{p} = \left(1 + \frac{1}{n} \sum_{i=1}^n k_i\right)^{-1}.

The posterior distribution of "p" given a Beta("α", "β") prior is

:p \sim \mathrm{Beta}\left(\alpha + n,\ \beta + \sum_{i=1}^n k_i\right).

Again the posterior mean \mathrm{E}[p] approaches the maximum likelihood estimate \widehat{p} as "α" and "β" approach zero.
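The limit can be seen directly from the mean of a Beta distribution; a minimal sketch (the sample values are made up for illustration):

```python
# Posterior for the failures-count variant: Beta(alpha + n, beta + sum(k_i)).
# Its mean is (alpha + n) / (alpha + beta + n + sum(k_i)); as alpha, beta -> 0
# this tends to the maximum likelihood estimate 1 / (1 + sample mean).
def posterior_mean(alpha, beta, sample):
    a = alpha + len(sample)
    b = beta + sum(sample)
    return a / (a + b)

sample = [0, 2, 1, 4, 0, 3]                  # hypothetical failure counts
mle = 1 / (1 + sum(sample) / len(sample))    # = 0.375 here
pm = posterior_mean(1e-9, 1e-9, sample)      # essentially equal to mle
```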

Other properties

* The probability-generating functions of "X" and "Y" are, respectively,

::G_X(s) = \frac{s\,p}{1 - s\,(1-p)}, \qquad G_Y(s) = \frac{p}{1 - s\,(1-p)}, \quad |s| < (1-p)^{-1}.

* Like its continuous analogue (the exponential distribution), the geometric distribution is memoryless. That means that if you intend to repeat an experiment until the first success, then, given that the first success has not yet occurred, the conditional probability distribution of the number of additional trials does not depend on how many failures have been observed. The die one throws or the coin one tosses does not have a "memory" of these failures. The geometric distribution is in fact the only memoryless discrete distribution.

* Among all discrete probability distributions supported on {1, 2, 3, ... } with given expected value μ, the geometric distribution "X" with parameter "p" = 1/μ is the one with the largest entropy.

* The geometric distribution of the number "Y" of failures before the first success is infinitely divisible, i.e., for any positive integer "n", there exist independent identically distributed random variables "Y"1, ..., "Y""n" whose sum has the same distribution that "Y" has. These will not be geometrically distributed unless "n" = 1; they follow a negative binomial distribution.

* The decimal digits of the geometrically distributed random variable "Y" are a sequence of independent (and "not" identically distributed) random variables. For example, the hundreds digit "D" has this probability distribution:

::\Pr(D=d) = \frac{q^{100d}}{1 + q^{100} + q^{200} + \cdots + q^{900}},

:where "q" = 1 − "p", and similarly for the other digits, and, more generally, similarly for numeral systems with other bases than 10. When the base is 2, this shows that a geometrically distributed random variable can be written as a sum of independent random variables whose probability distributions are indecomposable.
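The memorylessness property above can be checked directly from the survival function; a sketch:

```python
# Memorylessness of X on {1, 2, 3, ...}: Pr(X > m + n | X > m) = Pr(X > n),
# using the survival function Pr(X > k) = (1 - p)**k.
def survival(k, p):
    return (1 - p) ** k

p, m, n = 0.2, 5, 3
conditional = survival(m + n, p) / survival(m, p)  # Pr(X > m+n | X > m)
unconditional = survival(n, p)                     # Pr(X > n)
# the two agree up to floating-point error
```

The identity holds because the survival function is a pure exponential in "k", which is exactly what fails for every other discrete distribution.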

Related distributions

* The geometric distribution "Y" is a special case of the negative binomial distribution, with "r" = 1. More generally, if "Y"1,...,"Y""r" are independent geometrically distributed variables with parameter "p", then

::Z = \sum_{m=1}^r Y_m

:follows a negative binomial distribution with parameters "r" and "p".

* If "Y"1,...,"Y""r" are independent geometrically distributed variables (with possibly different success parameters p^{(m)}), then their minimum

::W = \min_{m} Y_m

:is also geometrically distributed, with parameter "p" given by

::1 - \prod_{m}\left(1 - p^{(m)}\right).

* Suppose 0 < "r" < 1, and for "k" = 1, 2, 3, ... the random variable "X""k" has a Poisson distribution with expected value "r""k"/"k". Then

::\sum_{k=1}^\infty k\,X_k

:has a geometric distribution taking values in the set {0, 1, 2, ...}, with expected value "r"/(1 − "r").

* The exponential distribution is the continuous analogue of the geometric distribution. If a random variable with an exponential distribution is rounded up to the next integer then the result is a discrete random variable with a geometric distribution.
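The rounding relationship in the last point can be illustrated by simulation; a sketch (the parameter values are arbitrary):

```python
import math
import random

# Rounding an Exponential(lam) variable up to the next integer gives a
# geometric distribution on {1, 2, 3, ...} with p = 1 - exp(-lam).
rng = random.Random(2)
lam = 0.5
n = 100_000
counts = {}
for _ in range(n):
    k = math.ceil(rng.expovariate(lam))
    counts[k] = counts.get(k, 0) + 1

p = 1 - math.exp(-lam)
empirical = counts[1] / n   # empirical Pr(X = 1), close to p
```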

See also

* Coupon collector's problem

External links

* Geometric distribution on PlanetMath.
* [http://mathworld.wolfram.com/GeometricDistribution.html Geometric distribution] on MathWorld.


Wikimedia Foundation. 2010.


