Incremental encoding

Incremental encoding

Incremental encoding, also known as front compression, back compression, or front coding, is a type of delta encoding compression algorithm whereby common prefixes or suffixes and their lengths are recorded so that they need not be duplicated. This algorithm is particularly well-suited for compressing sorted data, e.g., a list of words from a dictionary.

For example:

The encoding used to store the common prefix length itself varies from application to application. Typical techniques are storing the value as a single byte; delta encoding, which store only the change in the common prefix length; and various universal codes. It may be combined with other general lossless data compression techniques such as entropy encoding and dictionary coders to compress the remaining suffixes.

Applications

Incremental encoding is widely used in information retrieval to compress the lexicons used in search indexes; these list all the words found in all the documents and a pointer for each one to a list of locations. Typically, it compresses these indexes by about 40%. [Ian H. Witten, Alistair Moffat, Timothy C. Bell. Managing Gigabytes. Second edition. Academic Press. ISBN 1-55860-5703. Section 4.1: Accessing the lexicon, subsection Front coding, pp.159–161.]

As one example, incremental encoding is used as a starting point by the GNU locate utility, in an index of filenames and directories. The GNU locate utility further uses bigram encoding to further shorten popular filepath prefixes.

References


Wikimedia Foundation. 2010.

Игры ⚽ Нужно решить контрольную?

Look at other dictionaries:

  • Incremental backup — An incremental backup preserves data by not creating multiple copies that are based on the differences in those data: a successive copy of the data contains only that portion which has changed since the preceding copy has been created. Contents 1 …   Wikipedia

  • Delta encoding — Not to be confused with Elias delta coding. Delta encoding is a way of storing or transmitting data in the form of differences between sequential data rather than complete files; more generally this is known as data differencing. Delta encoding… …   Wikipedia

  • List of algorithms — The following is a list of the algorithms described in Wikipedia. See also the list of data structures, list of algorithm general topics and list of terms relating to algorithms and data structures.If you intend to describe a new algorithm,… …   Wikipedia

  • Rotary encoder — A rotary encoder, also called a shaft encoder, is an electro mechanical device used to convert the angular position of a shaft or axle to an analog or digital code, making it an angle transducer. These devices are used in industrial controls,… …   Wikipedia

  • GNU locate — is a Unix utility to find files on filesystems. It searches through a prebuilt database of files generated by updatedb and compressed using incremental encoding. It is significantly faster than find, but requires the database to be updated… …   Wikipedia

  • Recall (memory) — Recollection redirects here. For other uses, see Recollection (disambiguation). Recall in memory refers to the retrieval of events or information from the past. Along with encoding and storage, it is one of the three core processes of memory.… …   Wikipedia

  • VTD-XML — Infobox Software name = VTD XML caption = developer = XimpleWare latest release version = 2.4 latest release date = April 2, 2008 latest preview version = latest preview date = operating system = Portable platform = blog = [http://vtd… …   Wikipedia

  • XML — Infobox file format name = Extensible Markup Language icon = logo = extension = .xml mime = application/xml, text/xml (deprecated) type code = uniform type = public.xml magic = owner = World Wide Web Consortium genre = Markup language container… …   Wikipedia

  • Portable Document Format — PDF redirects here. For other uses, see PDF (disambiguation). Portable Document Format Adobe Reader icon Filename extension .pdf Internet media type application/pdf application/x pdf application/x bzpdf application/x gzpdf …   Wikipedia

  • Comparison of text editors — This article provides basic comparisons for common text editors. More feature details for text editors are available from the Category of text editor features and from the individual products articles. This article may not be up to date or… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”