Chomsky–Schützenberger theorem

Chomsky–Schützenberger theorem

In formal language theory, the Chomsky–Schützenberger theorem is a statement about the number of words of a given length generated by an unambiguous context-free grammar. The theorem provides an unexpected link between the theory of formal languages and abstract algebra. It is named after Noam Chomsky and Marcel-Paul Schützenberger.

Statement of the theorem

In order to state the theorem, we need a few notions from algebra and formal language theory.

A power series over \mathbb{N} is an infinite series of the form

f = f(x) = \sum_{k=0}^\infty a_k x^k = a_0 + a_1 x^1 + a_2 x^2 + a_3 x^3 + \cdots

with coefficients ak in \mathbb{N}. The multiplication of two formal power series f and g is defined in the expected way as the convolution of the sequences an and bn:

f(x)\cdot g(x) = \sum_{k=0}^\infty \left(\sum_{i=0}^k a_i b_{k-i}\right) x^k.

In particular, we write f^2 = f(x)\cdot f(x), f^3 = f(x)\cdot f(x)\cdot f(x), and so on. In analogy to algebraic numbers, a power series f(x) is called algebraic over \mathbb{Q}(x), if there exists a finite set of polynomials p_0(x), p_1(x), p_2(x), \ldots , p_n(x) each with rational coefficients such that

p_0(x) + p_1(x) \cdot f + p_2(x)\cdot f^2 + \cdots + p_n(x)\cdot f^n = 0.

Having established the necessary notions, the theorem is stated as follows.

Chomsky–Schützenberger theorem. If L is a context-free language admitting an unambiguous context-free grammar, and a_k := | L \ \cap \Sigma^k | is the number of words of length k in L, then \sum_{k = 0}^\infty a_k x^k is a power series over \mathbb{N} that is algebraic over \mathbb{Q}(x).

Proofs of this theorem are given by Kuich & Salomaa (1985), and by Panholzer (2005).

References

  • Chomsky, Noam & Schützenberger, Marcel-Paul: "The Algebraic Theory of Context-Free Languages," in Computer Programming and Formal Systems, P. Braffort and D. Hirschberg (eds.), North Holland, pp. 118–161, 1963. Available online.
  • Kuich, Werner & Salomaa, Arto: Semirings, Automata, Languages. Springer, 1985.
  • Panholzer, Alois: "Gröbner Bases and the Defining Polynomial of a Context-free Grammar Generating Function." Journal of Automata, Languages and Combinatorics 10:79–97, 2005.

Wikimedia Foundation. 2010.

Игры ⚽ Нужно сделать НИР?

Look at other dictionaries:

  • Noam Chomsky — Chomsky redirects here. For other topics with the same name, see Chomsky (disambiguation). Noam Chomsky Noam Chomsky visiting Vancouver, Canada in 2004 …   Wikipedia

  • Marcel-Paul Schützenberger — Born October 24, 1920(1920 10 24) Paris Died July 29, 1996(1996 07 29) (aged 75) …   Wikipedia

  • Dyck language — In the theory of formal languages of computer science, mathematics, and linguistics, the Dyck language is the language consisting of balanced strings of parentheses [ and ]. It is important in the parsing of expressions that must have a correctly …   Wikipedia

  • Combinatorics — is a branch of mathematics concerning the study of finite or countable discrete structures. Aspects of combinatorics include counting the structures of a given kind and size (enumerative combinatorics), deciding when certain criteria can be met,… …   Wikipedia

  • Langage rationnel — Les langages rationnels ou langages réguliers ou encore langages reconnaissables peuvent être décrits de plusieurs façons équivalentes: ce sont les langages décrits par les expressions régulières ou rationnelles,d où le nom de langages réguliers; …   Wikipédia en Français

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”