Finite state transducer

Finite state transducer

A finite-state transducer (FST) is a finite state machine with two tapes: an input tape and an output tape. This contrasts with an ordinary finite state automaton (or finite state acceptor), which has a single tape.

Overview

An automaton can be said to "recognize" a string if we view the content of its tape as input. In other words, the automaton computes a function that maps strings into the set {0,1}. Alternatively, we can say that an automaton "generates" strings, which means viewing its tape as an output tape. On this view, the automaton generates a formal language, which is a set of strings. The two views of automata are equivalent: the function that the automaton computes is precisely the indicator function of the set of strings it generates. The class of languages generated by finite automata is known as the class of regular languages.

The two tapes of a transducer are typically viewed as an input tape and an output tape. On this view, a transducer is said to "transduce" (i.e., translate) the contents of its input tape to its output tape, by accepting a string on its input tape and generating another string on its output tape. It may do so nondeterministically and it may produce more than one output for each input string. A transducer may also produce no output for a given input string, in which case it is said to "reject" the input. In general, a transducer computes a relation between two formal languages. The class of relations computed by finite state transducers is known as the class of rational relations.

Finite-state transducers are often used for morphological analysis in natural language processing research and applications.

Formal construction

Formally, a finite transducer "T" is a tuple ("Q", Σ, Γ, "I", "F", δ) such that:

* "Q" is a finite set, the set of "states";
* Σ is a finite set, called the "input alphabet";
* Γ is a finite set, called the "output alphabet";
* "I" is a subset of "Q", the set of "initial states";
* "F" is a subset of "Q", the set of "final states"; and
* delta subseteq Q imes (Sigmacup{epsilon}) imes (Gammacup{epsilon}) imes Q (where ε is the empty string) is the "transition relation".

We can view ("Q", δ) as a labeled directed graph, known as the "transition graph" of "T": the set of vertices is "Q", and (q,a,b,r)indelta means that there is a labeled edge going from vertex "q" to vertex "r". We also say that "a" is the "input label" and "b" the "output label" of that edge.

NOTE: This definition of finite transducer is also called "letter transducer" (Roche and Schabes 1997); alternative definitions are possible, but can all be converted into transducers following this one.

Define the "extended transition relation" delta^* as the smallest set such that:

* deltasubseteqdelta^*;
* (q,epsilon,epsilon,q)indelta^* for all qin Q; and
* whenever (q,x,y,r) in delta^* and (r,a,b,s) in delta then (q,xa,yb,s) in delta^*.

The extended transition relation is essentially the reflexive transitive closure of the transition graph that has been augmented to take edge labels into account. The elements of delta^* are known as "paths". The edge labels of a path are obtained by concatenating the edge labels of its constituent transitions in order.

The "behavior" of the transducer "T" is the rational relation ["T"] defined as follows: x [T] y if and only if there exists i in I and f in F such that (i,x,y,f) in delta^*. This is to say that "T" transduces a string xinSigma^* into a string yinGamma^* if there exists a path from an initial state to a final state whose input label is "x" and whose output label is "y".

Operations on finite state transducers

The following operations defined on finite automata also apply to finite transducers:

* Union. Given transducers "T" and "S", there exists a transducer Tcup S such that x [Tcup S] y if and only if x [T] y or x [S] y.

* Concatenation. Given transducers "T" and "S", there exists a transducer Tcdot S such that wx [Tcdot S] yz if and only if w [T] y and x [S] z.

* Kleene closure. Given a transducer "T", there exists a transducer T^* with the following properties: (1) epsilon [T^*] epsilon; (2) if w [T^*] y and x [T] z then wx [T^*] yz; and x [T^*] y does not hold unless mandated by (1) or (2).

Note that there is no notion of intersection of transducers. Instead, there is an operation of "composition" that is specific to transducers and whose construction is similar to that of intersection of automata. Composition is defined as follows:

* Given a transducer "T" on alphabets Σ and Γ and a transducer "S" on alphabets Γ and Δ, there exists a transducer T circ S on Σ and Δ such that x [T circ S] z if and only if there exists a string yinGamma^* such that x [T] y and y [S] z.

This definition uses the same notation which is used in mathematics for relation composition. However, the conventional reading for relation composition is the other way around: given two relations T and S, (x,z)in Tcirc S when there exist some y such that (x,y)in S and (y,z)in T.

One can also project out either tape of a transducer to obtain an automaton. There are two projection functions: pi_1 preserves the input tape, and pi_2 preserves the output tape. The first projection, pi_1 is defined as follows:

* Given a transducer "T", there exists a finite automaton pi_1 T such that pi_1 T accepts "x" if and only if there exists a string "y" for which x [T] y.

The second projection, pi_2 is defined similarly.

Additional properties of finite state transducers

* It is decidable whether the relation ["T"] of a transducer "T" is empty.

* It is decidable whether there exists a string "y" such that "x" ["T"] "y" for a given string "x".

* It is undecidable whether two transducers are equivalent.

* If one defines the alphabet of labels L=(Sigmacup{epsilon}) imes (Gammacup{epsilon}), finite state transducers are isomorphic to NDFA over the alphabet L, and may therefore be determinized (turned into deterministic finite state machines over the alphabet L= [(Sigmacup{epsilon}) imes Gamma] cup [Sigma imes (Gammacup{epsilon})] ) and subsequently minimized so that they have the minimum number of states.fact|date=September 2008

ee also

* Mealy machine
* Morphological dictionary

Notes

Further reading

*
*
*
*
*


Wikimedia Foundation. 2010.

Игры ⚽ Поможем сделать НИР

Look at other dictionaries:

  • Finite-state machine — State machine redirects here. For infinite state machines, see State transition system. For fault tolerance methodology, see State machine replication. SFSM redirects here. For the Italian railway company, see Circumvesuviana. A finite state… …   Wikipedia

  • Finite state machine — A finite state machine (FSM) or finite state automaton (plural: automata ) or simply a state machine, is a model of behavior composed of a finite number of states, transitions between those states, and actions. A finite state machine is an… …   Wikipedia

  • Finite-State-Machine — Abb.1 Beispiel eines EA Ein endlicher Automat (EA, auch Zustandsmaschine, englisch finite state machine (FSM)) ist ein Modell des Verhaltens, bestehend aus Zuständen, Zustandsübergängen und Aktionen. Ein Automat heißt endlich, wenn die Menge der… …   Deutsch Wikipedia

  • Finite State Machine — Abb.1 Beispiel eines EA Ein endlicher Automat (EA, auch Zustandsmaschine, englisch finite state machine (FSM)) ist ein Modell des Verhaltens, bestehend aus Zuständen, Zustandsübergängen und Aktionen. Ein Automat heißt endlich, wenn die Menge der… …   Deutsch Wikipedia

  • List of terms relating to algorithms and data structures — The [http://www.nist.gov/dads/ NIST Dictionary of Algorithms and Data Structures] is a reference work maintained by the U.S. National Institute of Standards and Technology. It defines a large number of terms relating to algorithms and data… …   Wikipedia

  • Список терминов, относящихся к алгоритмам и структурам данных —   Это служебный список статей, созданный для координации работ по развитию темы.   Данное предупреждение не устанавливается на информационные списки и глоссарии …   Википедия

  • Список терминов — Список терминов, относящихся к алгоритмам и структурам данных   Это сл …   Википедия

  • Cone (formal languages) — In formal language theory, a cone is a set of formal languages that has some desirable closure properties enjoyed by some well known sets of languages, in particular by the families of regular languages, context free languages and the recursive… …   Wikipedia

  • Moore machine — In the theory of computation, a Moore machine is a finite state machine, whose output values are determined solely by its current state.  Contents 1 Name 2 Formal definition 3 Visual representation …   Wikipedia

  • Speech recognition — For the human linguistic concept, see Speech perception. The display of the Speech Recognition screensaver on a PC, in which the character responds to questions, e.g. Where are you? or statements, e.g. Hello. Speech recognition (also known as… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”