Linear predictive coding

Linear predictive coding (LPC) is a technique used mostly in audio signal processing and speech processing for representing the spectral envelope of a digital speech signal in compressed form, using the parameters of a linear predictive model. It is one of the most powerful speech analysis techniques and one of the most useful methods for encoding good-quality speech at a low bit rate, and it provides accurate estimates of speech parameters.

Overview

LPC starts with the assumption that a speech signal is produced by a buzzer at the end of a tube (voiced sounds), with occasional added hissing and popping sounds (sibilants and plosive sounds). Although apparently crude, this model is actually a close approximation to the reality of speech production. The glottis (the space between the vocal folds) produces the buzz, which is characterized by its intensity (loudness) and frequency (pitch). The vocal tract (the throat and mouth) forms the tube, which is characterized by its resonances, which give rise to formants, or enhanced frequency bands in the sound produced. Hisses and pops are generated by the action of the tongue, lips and throat during sibilants and plosives.

LPC analyzes the speech signal by estimating the formants, removing their effects from the speech signal, and estimating the intensity and frequency of the remaining buzz. The process of removing the formants is called inverse filtering, and the remaining signal after the subtraction of the filtered modeled signal is called the residue.
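The analysis step can be sketched in a few lines of Python. This is a minimal illustration, not a production coder. It assumes the autocorrelation method with the Levinson-Durbin recursion, which is one common way to fit the predictor; the function names are hypothetical, and windowing, pre-emphasis and pitch estimation are left out.

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag], where r[k] = sum over n of x[n] * x[n - k]."""
    n = len(x)
    return [sum(x[i] * x[i - k] for i in range(k, n)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns (a, err), where a[k-1] weights x[n - k] in the prediction
    of x[n], and err is the remaining prediction error energy.
    """
    a = [0.0] * (order + 1)  # a[0] is unused padding for 1-based indexing
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable input
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection (partial correlation) coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def inverse_filter(x, a):
    """Residue e[n] = x[n] - sum_k a[k-1] * x[n - k], with zero history."""
    p = len(a)
    return [
        x[n] - sum(a[k] * x[n - 1 - k] for k in range(min(p, n)))
        for n in range(len(x))
    ]
```

For a frame `x`, calling `levinson_durbin(autocorrelation(x, p), p)` gives an order-`p` predictor, and `inverse_filter(x, a)` gives the residue that is left once the modeled resonances are removed.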

The numbers that describe the intensity and frequency of the buzz, the formants, and the residue signal can be stored or transmitted elsewhere. LPC synthesizes the speech signal by reversing the process: it uses the buzz parameters and the residue to create a source signal, uses the formants to create a filter (which represents the tube), and runs the source through the filter, resulting in speech.
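The synthesis step can be sketched the same way: an excitation (the "buzz") is run through an all-pole filter built from the predictor coefficients. The coefficient values, the pitch period and the function names below are hypothetical and chosen only for illustration.

```python
def synthesize(excitation, a, gain=1.0):
    """All-pole synthesis: y[n] = gain * excitation[n] + sum_k a[k-1] * y[n - k]."""
    y = []
    for n, e in enumerate(excitation):
        past = sum(a[k] * y[n - 1 - k] for k in range(min(len(a), n)))
        y.append(gain * e + past)
    return y


def pulse_train(num_samples, period):
    """Idealized buzz: a unit impulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(num_samples)]


# Hypothetical values: a single resonance and a pitch period of 80 samples
# (100 Hz at an assumed sampling rate of 8000 Hz).
a = [1.2, -0.81]
speechlike = synthesize(pulse_train(400, 80), a, gain=0.5)
```

If the excitation is the exact residue of a frame rather than an idealized pulse train, the same filter reproduces that frame sample for sample, up to rounding.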

Because speech signals vary with time, this process is done on short chunks of the speech signal, which are called frames; generally 30 to 50 frames per second give intelligible speech with good compression.
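As a rough illustration of the framing arithmetic, the sketch below splits a signal into non-overlapping frames. The 8000 Hz sampling rate is an assumption typical of telephone speech, not a figure from this article, and practical coders often use overlapping, windowed frames instead.

```python
def split_into_frames(samples, sample_rate, frames_per_second):
    """Split `samples` into consecutive non-overlapping frames.

    A trailing partial frame is dropped.
    """
    frame_len = sample_rate // frames_per_second
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


# One second of (placeholder) audio at an assumed 8000 Hz sampling rate.
one_second = [0.0] * 8000
frames = split_into_frames(one_second, sample_rate=8000, frames_per_second=50)
print(len(frames), len(frames[0]))  # 50 frames of 160 samples each

# Illustrative bit budget: a 2400 bit/s coder at 50 frames per second
# would have 48 bits to describe each frame.
print(2400 // 50)
```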

Early history of LPC

According to Robert M. Gray of Stanford University, the first ideas leading to LPC started in 1966 when S. Saito and F. Itakura of NTT described an approach to automatic phoneme discrimination that involved the first maximum likelihood approach to speech coding. In 1967, John Burg outlined the maximum entropy approach. In 1969 Itakura and Saito introduced partial correlation, in May Glen Culler proposed realtime speech encoding, and B. S. Atal presented an LPC speech coder at the Annual Meeting of the Acoustical Society of America. In 1971 realtime LPC using 16-bit LPC hardware was demonstrated by Philco-Ford; four units were sold.

In 1972 Bob Kahn of ARPA, with Jim Forgie (Lincoln Laboratory, LL) and Dave Walden (BBN Technologies), started the first developments in packetized speech, which would eventually lead to Voice over IP technology. In 1973, according to an informal Lincoln Laboratory history, the first realtime 2400 bit/s LPC was implemented by Ed Hofstetter. In 1974 the first realtime two-way LPC packet speech communication was accomplished over the ARPANET at 3500 bit/s between Culler-Harrison and Lincoln Laboratory. In 1976 the first LPC conference took place over the ARPANET using the Network Voice Protocol, between Culler-Harrison, ISI, SRI, and LL at 3500 bit/s. Finally, in 1978, Vishwanath et al. of BBN developed the first variable-rate LPC algorithm.

LPC coefficient representations

LPC is frequently used for transmitting spectral envelope information, and as such it has to be tolerant of transmission errors. Transmitting the filter coefficients directly (see linear prediction for the definition of the coefficients) is undesirable, since they are very sensitive to errors. In other words, a very small error can distort the whole spectrum, or worse, make the prediction filter unstable.

There are more advanced representations such as log area ratios (LAR), line spectral pairs (LSP) decomposition and reflection coefficients. Of these, LSP decomposition in particular has gained popularity, since it ensures stability of the predictor and keeps spectral errors local for small coefficient deviations.
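The following sketch illustrates why reflection coefficients are convenient to transmit. The all-pole filter is stable exactly when every reflection coefficient has magnitude below one, which is easy to check or enforce after quantization, and the direct-form predictor can be rebuilt from them with the step-up recursion. The function names are hypothetical, and the log area ratio formula uses one common sign convention; others exist.

```python
import math


def is_stable(reflection):
    """The all-pole filter is stable if and only if every |k| < 1."""
    return all(abs(k) < 1.0 for k in reflection)


def reflection_to_predictor(reflection):
    """Step-up recursion: rebuild predictor coefficients a[0..p-1],
    where a[k-1] weights x[n - k] in the prediction of x[n]."""
    a = []
    for k in reflection:
        m = len(a)
        a = [a[j] - k * a[m - 1 - j] for j in range(m)] + [k]
    return a


def log_area_ratios(reflection):
    """LAR_i = log((1 + k_i) / (1 - k_i)); requires |k_i| < 1."""
    return [math.log((1.0 + k) / (1.0 - k)) for k in reflection]


ks = [0.5, -0.5]  # hypothetical quantized reflection coefficients
if is_stable(ks):
    print(reflection_to_predictor(ks))  # [0.75, -0.5]
```

The log area ratio mapping stretches values near plus or minus one, where the spectrum is most sensitive to error, so a uniform quantizer applied to the LARs spends its resolution where it matters most.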

Applications

LPC is generally used for speech analysis and resynthesis. It is used as a form of voice compression by phone companies, for example in the GSM standard. It is also used for secure wireless, where voice must be digitized, encrypted and sent over a narrow voice channel; an early example of this is the US government's Navajo I.

LPC synthesis can be used to construct vocoders in which musical instruments serve as the excitation signal for the time-varying filter estimated from a singer's speech. This is somewhat popular in electronic music. Paul Lansky made the well-known computer music piece notjustmoreidlechatter using linear predictive coding [http://www.music.princeton.edu/~paul/liner_notes/morethanidlechatter.html]. A 10th-order LPC was used in the popular 1980s Speak & Spell educational toy.

Waveform ROM in digital sample-based music synthesizers made by Yamaha Corporation is compressed using an LPC algorithm.

LPC predictors of order 0 to 32 are used in the FLAC audio codec.
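The idea behind lossless predictive coding can be sketched as follows: the encoder stores the integer prediction error instead of the samples, and the decoder repeats the same prediction to recover the input exactly. The fixed second-order predictor and the function names are illustrative assumptions; this is not the FLAC bitstream format, and the entropy coding of the residual is omitted.

```python
def predict(history, n):
    """Fixed second-order predictor: extrapolate a straight line
    through the two previous samples."""
    return 2 * history[n - 1] - history[n - 2]


def encode(samples):
    """Keep the first two samples verbatim, then store prediction errors."""
    warmup = list(samples[:2])
    residual = [samples[n] - predict(samples, n) for n in range(2, len(samples))]
    return warmup, residual


def decode(warmup, residual):
    """Rebuild the samples by adding each error to the same prediction."""
    out = list(warmup)
    for r in residual:
        out.append(predict(out, len(out)) + r)
    return out


samples = [0, 3, 6, 10, 13, 15]
warmup, residual = encode(samples)
print(residual)  # [0, 1, -1, -1]
assert decode(warmup, residual) == samples
```

For smooth signals the residual values are much smaller than the samples themselves, so they can be stored in fewer bits.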

References

* [http://www-ee.stanford.edu/~gray/dl.html Robert M. Gray, IEEE Signal Processing Society, Distinguished Lecturer Program]

See also

*Warped Linear Predictive Coding
*Akaike information criterion
*Audio compression
*Pitch estimation
*FS-1015
*FS-1016

External links

* [http://soundlab.cs.princeton.edu/software/rt_lpc/ real-time LPC analysis/synthesis learning software]
* [http://www.hawksoft.com/hawkvoice/ HawkVoice open-source LPC software and API]
* [http://www.dspexperts.com/dsp/projects/lpc/ Speech Coding with Linear Predictive Coding (LPC)] DSP experts.com
* [http://www.engineer.tamuk.edu/SPark/chap7.pdf A good introduction to LPC] Dr. Sung-won Park [http://www.engineer.tamuk.edu/spark/spark.html] Texas A&M University-Kingsville [http://www.tamuk.edu/]


Wikimedia Foundation. 2010.
