Chinese restaurant process

Chinese restaurant process

In probability theory, the Chinese restaurant process is a discrete-time stochastic process, whose value at any positive-integer time n is a partition Bn of the set {1, 2, 3, ..., n} whose probability distribution is determined as follows. At time n = 1, the trivial partition { {1} } is obtained with probability 1. At time n + 1 the element n + 1 is either:

  1. added to one of the blocks of the partition Bn, where each block is chosen with probability |b|/(n + 1) where |b| is the size of the block, or
  2. added to the partition Bn as a new singleton block, with probability 1/(n + 1).

The random partition so generated is exchangeable in the sense that relabeling {1, ..., n} does not change the distribution of the partition, and it is consistent in the sense that the law of the partition of n − 1 obtained by removing the element n from the random partition at time n is the same as the law of the random partition at time n − 1.

Contents

Definition

One fancifully imagines a Chinese restaurant with an infinite number of circular tables, each with infinite capacity. Customer 1 is seated at an unoccupied table with probability 1. At time n + 1, a new customer chooses uniformly at random to sit at one of the following n + 1 places: directly to the left of one of the n customers already sitting at an occupied table, or at a new, unoccupied circular table. Clearly, each table corresponds to a block of a random partition. (For an account of the custom in actual Chinese restaurants, see table sharing). It is then immediate that the probability assigned to any particular partition (ignoring the order in which customers sit around any particular table) is


\Pr(B_n = B) = \dfrac{\prod_{b\in B} (|b| -1)!}{n!}

where b is a block in the partition B and |b| is the size (i.e. number of elements) of b. The restaurant analogy is due to Pitman and Dubins, and first appears in print in.[1]

Generalization

This construction can be generalized to a model with two parameters, α and θ,[2][3] commonly called the discount and strength (or concentration) parameters. At time n + 1, the next customer to arrive finds |B| occupied tables and decides to sit at an empty table with probability


\dfrac{\theta + |B| \alpha}{n + \theta},

or at an occupied table b of size |b| with probability


\dfrac{|b| -  \alpha}{n + \theta}.

In order for the construction to define a valid probability measure it is necessary to suppose that either α < 0 and θ = - Lα for some L ∈ {1, 2, ...}; or that 0 ≤ α ≤ 1 and θ > −α.

Under this model the probability assigned to any particular partition B of n is


\Pr(B_n = B) = 
    \dfrac{(\theta + \alpha)_{|B|, \alpha}}{(\theta+1)_{n-1, 1}} 
    \prod_{b\in B}(1-\alpha)_{|b|-1, 1}

where, by convention, (a)0,c = 1, and for b > 0


(a)_{b,c} = \prod_{i=0}^{b-1}(a+ic) = 
\begin{cases}
a^b & \text{if }c = 0, \\ \\
\dfrac{c^b\,\Gamma(a/c + b)}{\Gamma(a/c)} & \text{otherwise}. 
\end{cases}

Thus the partition probability can be expressed in terms of the Gamma function as


\Pr(B_n = B) =\dfrac{\Gamma(\theta)}{\Gamma(\theta+n)}\dfrac{\alpha^{|B|}\,\Gamma(\theta/\alpha + |B|) }{\Gamma(\theta/\alpha)}\prod_{b\in B}\dfrac{\Gamma(|b|-\alpha)}{\Gamma(1-\alpha)}.

In the one-parameter case, where α is zero, this simplifies to


\Pr(B_n = B) = \dfrac{\Gamma(\theta)\,\theta^{|B|}}{\Gamma(\theta+n)}\prod_{b\in B} \Gamma(|b|).

As before, the probability assigned to any particular partition depends only on the block sizes, so as before the random partition is exchangeable in the sense described above. The consistency property still holds, as before, by construction.

If α = 0, the probability distribution of the random partition of the integer n thus generated is the Ewens distribution with parameter θ, used in population genetics and the unified neutral theory of biodiversity.

Derivation

Here is one way to derive this partition probability. Let Ci be the random block into which the number i is added, for i = 1, 2, 3, ... . Then


\begin{align}
\Pr(C_i = c|C_1,\ldots,C_{i-1})
& {} = \begin{cases}
\dfrac{\theta + |B| \alpha }{\theta + i -1} & \text{if }c \in \text{new block}, \\  \\
\dfrac{|b| - \alpha }{\theta + i - 1} & \text{if }c\in b;
\end{cases}
\end{align}

The probability that Bn is any particular partition of the set { 1, ..., n } is the product of these probabilities as i runs from 1 to n. Now consider the size of block b: it increases by 1 each time we add one element into it. When the last element in block b is to be added in, the block size is (|b| − 1). For example, consider this sequence of choices: (generate a new block b)(join b)(join b)(join b). In the end, block b has 4 elements and the product of the numerators in the above equation gets θ · 1 · 2 · 3. Following this logic, we obtain Pr(Bn = B) as above.

Expected number of tables

For the one parameter case, with α = 0 and 0 < θ < ∞, the expected number of tables, given that there are n seated customers, is[4]


\begin{align}
\sum_{k=1}^n \frac{\theta}{\theta+k-1}.
\end{align}

In general case (α > 0) the expected number of occupied tables is[3]


\begin{align}
\frac{\Gamma(\theta+n+\alpha)\Gamma(\theta+1)}{\alpha \Gamma(\theta+n)\Gamma(\theta+\alpha)}-\frac{\theta}{\alpha}.
\end{align}

The Indian buffet process

It is possible to adapt the model such that each data point is no longer uniquely associated with a class (i.e. we are no longer constructing a partition), but may be associated with any combination of the classes. This strains the restaurant-tables analogy but can instead be described as a process in which each diner samples from some subset of an infinite selection of dishes on offer at a buffet. The probability that a particular diner samples a particular dish is proportional to the popularity of the dish among diners so far, and in addition the diner may sample from the untested dishes. This has been named the Indian buffet process and can be used to infer latent features in data.[5]

Applications

The Chinese restaurant process is closely connected to Dirichlet processes and Polya's urn scheme, and therefore useful in applications of nonparametric Bayesian methods including Bayesian statistics. The Generalized Chinese Restaurant Process is closely related to Pitman–Yor process. These processes have been used in many applications, including modeling text, clustering biological microarray data, biodiversity modelling and detecting objects in images[citation needed].

References

  1. ^ D. Aldous, "Exchangeability and Related Topics", in l'École d'été de probabilités de Saint-Flour, XIII–1983, pages 1–198. Berlin: Springer
  2. ^ Pitman, Jim (1995). "Exchangeable and Partially Exchangeable Random Partitions". Probability Theory and Related Fields 102 (2): 145–158. doi:10.1007/BF01213386. MR1337249. 
  3. ^ a b Pitman, Jim (2006). Combinatorial Stochastic Processes. Berlin: Springer-Verlag. http://works.bepress.com/jim_pitman/1/. 
  4. ^ Xinhua Zhang, "A Very Gentle Note on the Construction of Dirichlet Process", September 2008, The Australian National University, Canberra. Online: http://www.stat.purdue.edu/~zhang305/pubDoc/notes/dirichlet_process.pdf
  5. ^ Griffiths, T.L. and Ghahramani, Z. (2005) Infinite Latent Feature Models and the Indian Buffet Process. Gatsby Unit Technical Report GCNU-TR-2005-001.

External links


Wikimedia Foundation. 2010.

Игры ⚽ Поможем написать реферат

Look at other dictionaries:

  • Chinese restaurant (disambiguation) — A Chinese restaurant is a food establishment serving Chinese cuisine, the term can also refer to: Overseas Chinese cuisine Overseas Chinese restaurant American Chinese cuisine Canadian Chinese cuisine Other Chinese restaurant syndrome,… …   Wikipedia

  • Chinese restaurant — See: * Chinese cuisine * American Chinese cuisine * Canadian Chinese cuisine * Chinese restaurant syndrome * Chinese restaurant process (a concept in probability theory) * Cantonese restaurant * The Chinese Restaurant , a second season episode of …   Wikipedia

  • Chinese noodles — Misua noodle making in Lukang, Taiwan Noodles are an essential ingredient and staple in Chinese cuisine. There is a great variety of Chinese noodles, which vary according to their region of production, ingredients, shape or width, and manner of… …   Wikipedia

  • Dirichlet process — In probability theory, a Dirichlet process is a stochastic process that can be thought of as a probability distribution whose domain is itself a random distribution. That is, given a Dirichlet process , where H (the base distribution) is an… …   Wikipedia

  • Chinese American history — is the history of Chinese Americans or the history of ethnic Chinese in the United States. Chinese immigration to the U.S. consisted of three major waves, with the first beginning in the 19th century. Chinese immigrants in the 19th century worked …   Wikipedia

  • Chinese tea culture — Traditional Chinese 中國茶文化 Simplified Chinese …   Wikipedia

  • Chinese people in Israel — Chinese in Israel Total population 23,000 Regions with significant populations Tel Aviv, Haifa, Jerusalem and other cities Languages Chinese, Hebrew Religion Predominantly …   Wikipedia

  • Chinese Islamic cuisine — This article is part of the series …   Wikipedia

  • Chinese Indonesians — For notable Indonesian people of Chinese descent, see List of Chinese Indonesians. Chinese Indonesians Chinese Indonesians pray at a temple in Glodok, Jakar …   Wikipedia

  • Chinese culture — For culture in mainland China after 1949, see Culture of the People s Republic of China. A Chinese opera (Beijing opera) performance in Beijing. Chinese culture is one of the world s oldest and most complex.[1] …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”