Birthday problem


Birthday problem

In probability theory, the birthday problem, [This is not a paradox in the sense of leading to a logical contradiction, but is called a paradox because the mathematical truth contradicts naïve intuition: most people estimate that the chance is much lower than 50%.] pertains to the probability that in a set of randomly chosen people some pair of them will have the same birthday. In a group of 23 (or more) randomly chosen people, there is more than 50% probability that some pair of them will both have been born on the same day. For 57 or more people, the probability is more than 99%, tending toward 100% as the pool of people grows. [ Note that birthdays are not evenly distributed throughout the year; not only does February 29 occur less than a quarter as often as any other day, but birth rates vary for the other 365 days.] The mathematics behind this problem leads to a well-known cryptographic attack called the birthday attack.

"'

Understanding the problem

The birthday problem asks whether "any" of the 23 people have a matching birthday with "any" of the others — not one in particular. (See "Same birthday as you" below for an analysis of this much less surprising alternative problem.)

In a list of 23 people, comparing the birthday of the first person on the list to the others allows 22 chances for a matching birthday, but comparing every person to all of the others allows 253 distinct chances: in a group of 23 people there are 23×22/2 = 253 pairs. The probability that two people chosen from the entire population at random have the same birthday is 1/365. Although the pairings in a group of 23 people are not statistically equivalent to 253 pairs chosen independently, the birthday paradox becomes less surprising if a group is thought of in terms of the number of possible pairs, rather than the number of individuals.

It is easier to figure the probability that the birthdays will be different, such as: with one person they have 365 opportunities to have a different birthday. The second person only has 364 possibilities to have a different birthday than the first person. The third person has 363 days, and so on. Thus when the group reaches 366 a clash is inevitable — all the days will have been used up, save for leap years.

Calculating the probability

To compute the approximate probability that in a room of "n" people, at least two have the same birthday, we disregard variations in the distribution, such as leap years, twins, seasonal or weekday variations, and assume that the 365 possible birthdays are equally likely. Real-life birthday distributions are not uniform since not all dates are equally likely. [In particular, many children are born in the summer, especially the months of August and September (for the northern hemisphere) [http://scienceworld.wolfram.com/astronomy/LeapDay.html] , and in the U.S. it has been noted that many children are conceived around the holidays of Christmas and New Year's Day; and, in environments like classrooms where many people share a birth year, it becomes relevant that due to the way hospitals work, where C-sections and induced labor are not generally scheduled on the weekend, more children are born on Mondays and Tuesdays than on weekends. Both of these factors tend to increase the chance of identical birth dates, since a denser subset has more possible pairs (in the extreme case when everyone was born on three days, there would obviously be many identical birthdays). The birthday problem for such non-constant birthday probabilities was tackled by Murray Klamkin in 1967. A formal proof that the probability of two matching birthdays is least for a uniform distribution of birthdays was given by D. Bloom (1973)]

It is easier to first calculate the probability "p"("n") that all "n" birthdays are "different". If "n" > 365, by the pigeonhole principle this probability is 0. On the other hand, if "n" ≤ 365, it is

:ar p(n) = 1 imes left(1-frac{1}{365} ight) imes left(1-frac{2}{365} ight) cdots left(1-frac{n-1}{365} ight) = { 365 imes 364 cdots (365-n+1) over 365^n } = { 365! over 365^n (365-n)!}

because the second person cannot have the same birthday as the first (364/365), the third cannot have the same birthday as the first two (363/365), etc.

The event of at least two of the "n" persons having the same birthday is complementary to all "n" birthdays being different. Therefore, its probability "p"("n") is

: p(n) = 1 - ar p(n) .

This probability surpasses 1/2 for "n" = 23 (with value about 50.7%). The following table shows the probability for some other values of "n" (This table ignores the existence of leap years, as described above):

Thus in a group of just seven random people, it is more likely than not that two of them will have a birthday within a week of each other.M. Abramson and W. O. J. Moser (1970) "More Birthday Surprises", American Mathematical Monthly 77, 856–858]

Collision counting

The probability that the "k"th integer randomly chosen from [1, "d"] will repeat at least one previous choice equals "q"("k" − 1; "d") above. The expected total number of times a selection will repeat a previous selection as "n" such integers are chosen equals

:sum_{k=1}^n q(k-1;d) = n - d + d left (frac {d-1} {d} ight )^n.

Average number of people

In an alternative formulation of the birthday problem, one asks the "average" number of people required to find a pair with the same birthday. The problem is relevant to several hashing algorithms analyzed by Donald Knuth in his monumental work "The Art of Computer Programming". It may be shownD. E. Knuth; "The Art of Computer Programming. Vol. 3, Sorting and Searching" (Addison-Wesley, Reading, Massachusetts, 1973)] P. Flajolet, P. J. Grabner, P. Kirschenhofer, H. Prodinger (1995), "On Ramanujan's Q-Function", Journal of Computational and Applied Mathematics 58, 103–116] that if one samples uniformly, with replacement, from a population of size M, the number of trials required for the first repeated sampling of "some" individual has expected value scriptstyleoverline{n},=,1+Q(M), where

: Q(M)=sum_{k=1}^{M} frac{M!}{(M-k)! M^k}.

The function

: Q(M)= 1 + frac{M-1}{M} + frac{(M-1)(M-2)}{M^2} + cdots + frac{(M-1)(M-2) cdots 1}{M^{M-1

has been studied by Srinivasa Ramanujan and has asymptotic expansion:

: Q(M)simsqrt{frac{pi M}{2-frac{1}{3}+frac{1}{12}sqrt{frac{pi}{2n-frac{4}{135n}+ldots.

With "M" = 365 days in a year, the average number of people required to find a pair with the same birthday is scriptstyleoverline{n},=,1+Q(M)approx 24.61658, slightly more than the number required for a 50% chance. In the best case, two people will suffice; at worst, the maximum possible number of "M" + 1 = 366 people is needed; but on average, only 25 people are required.

An "informal" demonstration of the problem can be made from the List of Prime Ministers of Australia, in which Paul Keating, the 24th Prime Minister, is the first to share a birthday with another on the list.

Partition problem

A related problem is the partition problem, a variant of the knapsack problem from computer science. Some weights are put on a balance; each weight is an integer number of grams randomly chosen between one gram and one million grams (one metric ton). The question is whether you can usually (that is, with probability close to 1) transfer the weights between the left and right arms to balance the scale. (In case the sum of all the weights is an odd number of grams, a discrepancy of one gram is allowed.) If there are only two or three weights, the answer is very clearly no; although there are some combinations which work, the majority of randomly selected combinations of three weights do not. If there are very many weights, the answer is clearly yes. The question is, how many are just sufficient? That is, what is the number of weights such that it is equally likely for it to be possible to balance them as impossible?

Some people's intuition is that the answer is above 100,000. Most people's intuition is that it is in the thousands or tens of thousands, while others feel it should at least be in the hundreds. The correct answer is approximately 23.

The reason is that the correct comparison is to the number of partitions of the weights into left and right. There are 2"N"−1 different partitions for "N" weights, and the left sum minus the right sum can be thought of as a new random quantity for each partition. The distribution of the sum of weights is approximately Gaussian, with a peak at 1,000,000 "N" and width scriptstyle 1,000,000sqrt{N}, so that when 2"N"−1 is approximately equal to scriptstyle 1,000,000sqrt{N} the transition occurs. 223−1 is about 4 million, while the width of the distribution is only 5 million. [C. Borgs, J. Chayes, and B. Pittel (2001) "Phase Transition and Finite Size Scaling in the Integer Partition Problem", Random Structures and Algorithms 19(3–4), 247–288.]

Notes

References

*E. H. McKinney (1966) "Generalized Birthday Problem", American Mathematical Monthly 73, 385–387.
*M. Klamkin and D. Newman (1967) "Extensions of the Birthday Surprise", Journal of Combinatorial Theory 3, 279–282.
*M. Abramson and W. O. J. Moser (1970) "More Birthday Surprises", American Mathematical Monthly 77, 856–858
*D. Bloom (1973) "A Birthday Problem", American Mathematical Monthly 80, 1141–1142.
*Shirky, Clay "Here Comes Everybody: The Power of Organizing Without Organizations", (2008.) New York. 25–27.

External links

* http://www.efgh.com/math/birthday.htm
* http://planetmath.org/encyclopedia/BirthdayProblem.html
*
* [http://www.nestel.net/maple/bd/bd.html Maple vs. birthday paradox]
* [http://www.damninteresting.com/?p=402 A humorous article explaining the paradox]
* [http://www.excelexchange.com/Birthday%20Problem.xls The Birthday Problem Spreadsheet]
* [http://wiki.stat.ucla.edu/socr/index.php/SOCR_EduMaterials_Activities_BirthdayExperiment SOCR EduMaterials Activities BirthdayExperiment]
* [http://jeff.aaron.ca/cgi-bin/birthday The Birthday Paradox: Odds Calculator]


Wikimedia Foundation. 2010.

Look at other dictionaries:

  • Birthday attack — A birthday attack is a type of cryptographic attack, so named because it exploits the mathematics behind the birthday problem in probability theory. Given a function f , the goal of the attack is to find two inputs x 1,x 2 such that f(x 1)=f(x 2) …   Wikipedia

  • Birthday (disambiguation) — A birthday is an annual celebration of the date on which a person was born.Birthday may also refer to:In music: * Birthday (The Beatles song) * Birthday (Flo Rida song) * Birthday (Junior Boys song) * Birthday (The Sugarcubes song) * Birthday… …   Wikipedia

  • Birthday Cake Interview — The Birthday Cake Interview refers to a famous political interview in Australia that was carried out between interviewer Mike Willesee and Liberal Party Opposition Leader Dr John Hewson shortly before the 1993 federal election. It is remembered… …   Wikipedia

  • German tank problem — During World War II, production of German tanks such as the Panther was accurately estimated by Allied intelligence using statistical methods. In the statistical theory of estimation, estimating the maximum of a uniform distribution is a common… …   Wikipedia

  • Plato's Problem — is the term given by Noam Chomsky to the gap between knowledge and experience. It presents the question of how we account for our knowledge when environmental conditions seem to be an insufficient source of information. It is used in linguistics… …   Wikipedia

  • Not the Captain's Birthday Party? — Live album by The Damned Released July 1986 Recorded November 27, 1977 Genre Punk rock …   Wikipedia

  • Coupon collector's problem — In probability theory, the coupon collector s problem describes the collect all coupons and win contests. It asks the following question: Suppose that there are n coupons, from which coupons are being collected with replacement. What is the… …   Wikipedia

  • Year 1900 problem — The year 1900 problem is a problem understanding which century before or after year 1900 an event occurred. Unlike the year 2000 problem, it is not a problem tied to computer software alone, since the problem existed before electronic computers… …   Wikipedia

  • Alma Problem — The Alma Problem is an issue of concern to musicologists, historians and biographers who deal with the lives and works of Gustav Mahler and his wife Alma.Alma Mahler (ultimately Alma Mahler Gropius Werfel) was not only an articulate, well… …   Wikipedia

  • The Captain's Birthday Party — is a live recording by The Damned, based on a recording from November 27, 1977 at The Roundhouse, in London. It was released in June, 1986. It should not be confused with Not The Captain s Birthday Party? , which is a different album, with… …   Wikipedia


Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”

We are using cookies for the best presentation of our site. Continuing to use this site, you agree with this.