 Adaptive Huffman coding

Adaptive Huffman coding (also called Dynamic Huffman coding) is an adaptive coding technique based on Huffman coding. It permits building the code as the symbols are being transmitted, having no initial knowledge of source distribution, that allows onepass encoding and adaptation to changing conditions in data.
The benefit of onepass procedure is that the source can be encoded in real time, though it becomes more sensitive to transmission errors, since just a single loss ruins the whole code.
Contents
Algorithms
There are a number of implementations of this method, the most notable are FGK (FallerGallagerKnuth) and Vitter algorithm.
Vitter algorithm
Code is represented as a tree structure in which every node has a corresponding weight and a unique number.
Numbers go down, and from right to left.
Weights must satisfy the sibling property, which states that nodes must be listed in the order of decreasing weight with each node adjacent to its sibling. Thus if A is the parent node of B and C is a child of B, then W(A) > W(B) > W(C).
The weight is merely the count of symbols transmitted which codes are associated with children of that node.
A set of nodes with same weights make a block.
To get the code for every node, in case of binary tree we could just traverse all the path from the root to the node, writing down (for example) "1" if we go to the right and "0" if we go to the left.
We need some general and straightforward method to transmit symbols that are "not yet transmitted" (NYT). We could use, for example, transmission of binary numbers for every symbol in alphabet.
Encoder and decoder start with only the root node, which has the maximum number. In the beginning it is our initial NYT node.
When we transmit an NYT symbol, we have to transmit code for the NYT node, then for its generic code.
For every symbol that is already in the tree, we only have to transmit code for its leaf node.
For every symbol transmitted both the transmitter and receiver execute the update procedure:
 If current symbol is NYT, add two child nodes to NYT node. One will be a new NYT node the other is a leaf node for our symbol. Increase weight for the new leaf node and the old NYT and go to step 4. If not, go to symbol's leaf node.
 If this node does not have the highest number in a block, swap it with the node having the highest number, except if that node is its parent [1]
 Increase weight for current node
 If this is not the root node go to parent node then go to step 2. If this is the root, end.
Note: swapping nodes means swapping weights and corresponding symbols, but not the numbers.
Example
Start with an empty tree.
For "a" transmit its binary code.
NYT spawns two child nodes: 254 and 255. Increase weight for root. Code for "a", associated with node 255, is 1.
For "b" transmit 0 (for NYT node) then its binary code.
NYT spawns two child nodes: 252 for NYT and 253 for leaf node. Increase weights for 253, 254, and root. Code for "b" is 01.
For the second "b" transmit 01.
Go to that leaf node, 253. We have a block of weights of 1 and the biggest number in the block is 255, so swap the weights and symbols of nodes 253 and 255, increase weight, go to root, increase weight for root.
Future code for "b" is 1, and for "a" is now 01, which reflects their frequency.
References
 Vitter's original paper: J. S. Vitter, "Design and Analysis of Dynamic Huffman Codes", Journal of the ACM, 34(4), October 1987, pp 825–845.
 J. S. Vitter, "ALGORITHM 673 Dynamic Huffman Coding", ACM Transactions on Mathematical Software, 15(2), June 1989, pp 158–167. Also appears in Collected Algorithms of ACM.
External links
 Paul E. Black, adaptive Huffman coding at the NIST Dictionary of Algorithms and Data Structures.
 University of California Dan Hirschberg site
 Cardiff University Dr. David Marshall site
 C implementation of Vitter algorithm
 Excellent description from Duke University
Data compression methods Information theory Lossless Shannon–Fano · Shannon–Fano–Elias · Huffman · Adaptive Huffman · Arithmetic · Range · Golomb · Universal (Gamma · ExpGolomb · Fibonacci · Levenshtein)RLE · Byte pair encoding · DEFLATE · Lempel–Ziv (LZ77/78 · LZSS · LZW · LZWL · LZO · LZMA · LZX · LZRW · LZJB · LZS · LZT · ROLZ) · Statistical Lempel ZivOthersAudio Audio codec partsOthersImage TermsMethodsOthersVideo TermsVideo characteristics · Frame · Frame rate · Interlace · Frame types · Video quality · Video resolutionOthersSee Compression formats for formats and Compression software implementations for codecsCategories: Lossless compression algorithms
Wikimedia Foundation. 2010.
Look at other dictionaries:
Huffman coding — Huffman tree generated from the exact frequencies of the text this is an example of a huffman tree . The frequencies and codes of each character are below. Encoding the sentence with this code requires 135 bits, as opposed of 288 bits if 36… … Wikipedia
Modified Huffman coding — is used in fax machines to encode black on white images (bitmaps). It combines the variable length codes of Huffman coding with the coding of repetitive data in run length encoding. External links Modified Huffman coding from UNESCO . Archived… … Wikipedia
Adaptive Binary Optimization — Adaptive Binary Optimization, (ABO), is a supposed lossless image compression algorithm by MatrixView Ltd. It uses a patented method to compress the high correlation found in digital content signals and additional compression with standard… … Wikipedia
Arithmetic coding — is a method for lossless data compression. Normally, a string of characters such as the words hello there is represented using a fixed number of bits per character, as in the ASCII code. Like Huffman coding, arithmetic coding is a form of… … Wikipedia
Advanced Video Coding — H.264 Pour les articles homonymes, voir AVC. H.264, ou MPEG 4 AVC (Advanced Video Coding), est une norme de codage vidéo développée conjointement par l UIT T Q.6/SG16 Video Coding Experts Group (VCEG) ainsi que l ISO/CEI Moving Picture Experts… … Wikipédia en Français
NegaFibonacci coding — Numeral systems by culture Hindu Arabic numerals Western Arabic (Hindu numerals) Eastern Arabic Indian family Tamil Burmese Khmer Lao Mongolian Thai East Asian numerals Chinese Japanese Suzhou Korean Vietnamese … Wikipedia
Golomb coding — is a data compression scheme invented by Solomon W. Golomb in the 1960s. The scheme is based on entropy encoding and is optimal (in the sense of Shannon s source coding theorem) for alphabets following a geometric distribution, making it highly… … Wikipedia
Contextadaptive variablelength coding — Le Context adaptive variable length coding ou CAVLC est une forme de codeur entropique à longueur variable utilisé dans la norme vidéo H.264 ou MPEG 4 AVC. Il fait partie des techniques de compression sans perte, c est à dire qu à partir du code… … Wikipédia en Français
ContextAdaptive Variable Length Coding — Las siglas CAVLC corresponden a las iniciales de Context Adaptive Variable Length Coding, que traducido del inglés significa codificación adaptativa según el contexto de longitud variable. El objetivo de esta codificación es procesar la… … Wikipedia Español
Advanced Video Coding — H.264/MPEG 4 AVC ist ein Standard zur hocheffizienten Videokompression. Er wurde zunächst von der ITU (Study Group 16, Video Coding Experts Group) unter dem Namen H.26L entwickelt. Im Jahre 2001 schloss sich die ITU Gruppe mit MPEG Visual… … Deutsch Wikipedia