Uuencoding

Uuencoding is a form of binary-to-text encoding that originated in the Unix program uuencode, which encoded binary data for transmission over the uucp mail system. The name "uuencoding" is derived from "Unix-to-Unix encoding". Because uucp converted characters between the character sets of different computers, uuencode converted the data to a set of common characters that were unlikely to be "translated" in transit, which would have corrupted the file. The program uudecode reverses the effect of uuencode, recreating the original binary file exactly. uuencode and uudecode became popular for sending binary files by e-mail and for posting them to Usenet newsgroups. Uuencoding has now been largely replaced by MIME and yEnc. With MIME, files that might once have been uuencoded are transferred with base64 encoding.

Encoded format

A file in uuencoded format starts with a header line of the form:

    begin <mode> <file>

where <mode> is the file's Unix read/write/execute permissions as three octal digits, and <file> is the name to be used when recreating the binary data. The file ends with two trailer lines:

    `
    end

(The grave accent indicates a line that encodes zero bytes; see below.)

Lines between the header and trailer encode data.

Each data line starts with a character indicating the number of data bytes encoded on that line and ends with a newline character. All data lines, except perhaps the last, encode 45 bytes of data. The corresponding encoded length value is 'M' (see below), so most lines begin with 'M'.

After the length character, a data line contains groups of four characters, each of which encodes three bytes of data. If the number of data bytes for a line is not divisible by three, one or two additional zero bytes are appended to the input data before encoding, so the encoding always consists of whole groups of four characters. Those padding bytes are not included in the count at the beginning of the last line.

A data line's byte count is encoded by adding 32 and using the corresponding ASCII character, except that a byte count of zero is encoded as grave accent ("`", code 96).

(In ASCII the first thirty-two characters are non-printing control characters, which could be modified or deleted in transmission. The next ninety-five characters, at code 32 and above, are all printable. Since the byte count is in the range 0-45, adding 32 converts it into a printable character. The ASCII code for 'M' is 77, or exactly 45 + 32. For a zero-length line, adding 32 to 0 gives 32, corresponding to a space character. This character was also problematic for data transmission, so the grave accent ("`", code 96) is used instead. Subtracting 32 from 96 gives 64, a value whose lower six bits are all 0.)
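
The byte-count rule can be sketched in a few lines of Python. This is a minimal illustration rather than the code of any particular uuencode implementation; the function names length_char and length_from_char are made up for this example.

```python
def length_char(count: int) -> str:
    """Return the character that starts a data line holding `count` bytes."""
    if not 0 <= count <= 45:
        raise ValueError("a uuencoded data line holds 0 to 45 bytes")
    # Zero would map to a space (code 32), so the grave accent (code 96)
    # is used instead.
    return "`" if count == 0 else chr(count + 32)


def length_from_char(ch: str) -> int:
    """Recover the byte count from the first character of a data line."""
    # Keeping only the lower six bits maps both ' ' (32) and '`' (96) to 0.
    return (ord(ch) - 32) & 0x3F


if __name__ == "__main__":
    print(length_char(45), length_char(3), length_char(0))
```

Masking with 0x3F on the decoding side is what lets a decoder accept either a space or a grave accent as "zero bytes".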

Each group of three bytes is encoded into four characters. The bytes are concatenated into a 24-bit value in big-endian order. (The first byte becomes the most significant 8 bits of the value.) The 24-bit value is then split into four groups of six bits each, also in big-endian order. (The most significant six bits become the first group.) Each group of six bits is then encoded into a character using the same calculation as for byte counts. (Since the range of values is from 0 to 63, when 32 is added the ASCII characters will lie in the range 32 (space) to 32 + 63 = 95 (underscore).) ASCII characters greater than 95 may also be used; however, only the six right-most bits are relevant.
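
As a rough sketch of the grouping described above, the following Python function turns one three-byte group into four characters. The name encode_group is hypothetical, and the choice to emit a grave accent rather than a space for a zero value is an assumption that follows the byte-count rule; some encoders emit a space instead.

```python
def encode_group(group: bytes) -> str:
    """Encode exactly three bytes as four uuencode characters."""
    if len(group) != 3:
        raise ValueError("expected exactly three bytes")
    # Concatenate the bytes into one 24-bit value, first byte most significant.
    value = int.from_bytes(group, "big")
    chars = []
    # Take the four 6-bit groups from most significant to least significant.
    for shift in (18, 12, 6, 0):
        six = (value >> shift) & 0x3F
        # Same calculation as for byte counts: add 32, but use '`' for zero.
        chars.append("`" if six == 0 else chr(six + 32))
    return "".join(chars)


if __name__ == "__main__":
    print(encode_group(b"Cat"))
```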

Sometimes each data line has extra dummy characters (often the grave accent, ASCII 96) appended to avoid problems with mailers that strip trailing spaces. These characters are ignored by uudecode. The grave accent can also be used in place of a space character within the encoded data.
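
A tolerant line decoder along these lines might look as follows. It is a sketch under stated assumptions, not the behaviour of any specific uudecode: the name decode_line is invented here, and re-padding with spaces when a mailer has stripped trailing spaces is one possible recovery strategy.

```python
def decode_line(line: str) -> bytes:
    """Decode one uuencoded data line, ignoring extra trailing characters."""
    line = line.rstrip("\r\n")
    if not line:
        return b""
    # The first character gives the number of data bytes on this line.
    count = (ord(line[0]) - 32) & 0x3F
    # Each group of up to three bytes occupies four characters.
    needed = 4 * ((count + 2) // 3)
    # Drop dummy characters beyond the needed length; if trailing spaces were
    # stripped in transit, pad them back (a space decodes to zero).
    body = line[1:1 + needed].ljust(needed)
    out = bytearray()
    for i in range(0, needed, 4):
        value = 0
        for ch in body[i:i + 4]:
            # Only the six right-most bits matter, so '`' and ' ' both give 0.
            value = (value << 6) | ((ord(ch) - 32) & 0x3F)
        out += value.to_bytes(3, "big")
    # Discard the zero bytes that were added as padding by the encoder.
    return bytes(out[:count])


if __name__ == "__main__":
    print(decode_line("#0V%T``"))
```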

As a complete file, the uuencoded output for (the ASCII bytes representing the string) "Cat" would be

    begin 644 cat.txt
    #0V%T
    `
    end

The begin line is a standard uuencode header; the '#' indicates that its line encodes three characters; the last two lines appear at the end of all uuencoded files.
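
Putting the header, data lines and trailer together, a whole-file encoder could be sketched as below. The function name uuencode and its parameters are illustrative assumptions, as is the use of a grave accent for every zero value.

```python
def uuencode(data: bytes, name: str, mode: int = 0o644) -> str:
    """Return `data` as a complete uuencoded file (header, data, trailer)."""

    def enc(six: int) -> str:
        # Add 32 to get a printable character; zero becomes a grave accent.
        return "`" if six == 0 else chr(six + 32)

    lines = [f"begin {mode:03o} {name}"]
    for start in range(0, len(data), 45):      # at most 45 bytes per line
        chunk = data[start:start + 45]
        # Pad with zero bytes up to a multiple of three; the padding is not
        # counted in the length character.
        padded = chunk + b"\x00" * (-len(chunk) % 3)
        body = []
        for i in range(0, len(padded), 3):
            value = int.from_bytes(padded[i:i + 3], "big")
            body.extend(enc((value >> shift) & 0x3F) for shift in (18, 12, 6, 0))
        lines.append(enc(len(chunk)) + "".join(body))
    lines += ["`", "end"]                      # the two trailer lines
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(uuencode(b"Cat", "cat.txt"), end="")
```

Running the script on the bytes of "Cat" reproduces the four-line file shown above.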

Sample uuencoding

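The encoding process is demonstrated by this table, which shows the derivation of the above encoding for "Cat".

    Original characters        C          a          t
    Original ASCII, decimal    67         97         116
    Original ASCII, binary     01000011   01100001   01110100
    6-bit groups, binary       010000   110110   000101   110100
    6-bit groups, decimal      16       54       5        52
    Decimal + 32               48       86       37       84
    Uuencoded characters       0        V        %        T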

Uuencode table

The following table represents the subset of ASCII characters used by UUEncode and the 6-bit binary string they represent (in octal).
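
The mapping can be regenerated from the "add 32" rule. The short Python sketch below lists each 6-bit value in octal next to its character; the function name uuencode_table is hypothetical, and showing the grave accent (rather than a space) for value 0 is an assumption based on the rule for zero given earlier.

```python
def uuencode_table() -> list[tuple[str, str]]:
    """Return (octal value, character) pairs for the 64 six-bit values."""
    rows = []
    for value in range(64):
        # Zero is shown as a grave accent; every other value is value + 32.
        ch = "`" if value == 0 else chr(value + 32)
        rows.append((f"{value:02o}", ch))
    return rows


if __name__ == "__main__":
    for octal, ch in uuencode_table():
        print(octal, ch)
```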

POSIX Base64 coding

Despite its limited range of characters, uuencoded data is sometimes mangled on passage through certain old computers. The worst offenders are computers using non-ASCII character sets such as EBCDIC. One attempt to fix the problem was the Xxencode format, which used only alphanumeric characters and the plus and minus symbols. More common today is the Base64 format, which can also be generated by the uuencode program. The header is changed to

    begin-base64 <mode> <file>

the trailer becomes

    ====

and the lines between are encoded with characters chosen from

    ABCDEFGHIJKLMNOP
    QRSTUVWXYZabcdef
    ghijklmnopqrstuv
    wxyz0123456789+/
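
For comparison, the Base64 variant can be sketched with Python's standard base64 module. The function name uuencode_base64 is made up for this example, and splitting the input into 45-byte chunks (60 encoded characters per line) is an assumption carried over from the classic line length; other line lengths may be used by real implementations.

```python
import base64


def uuencode_base64(data: bytes, name: str, mode: int = 0o644) -> str:
    """Return `data` wrapped in a begin-base64 header and a ==== trailer."""
    lines = [f"begin-base64 {mode:03o} {name}"]
    # 45 is a multiple of three, so only the final line can carry '=' padding.
    for start in range(0, len(data), 45):
        chunk = data[start:start + 45]
        lines.append(base64.b64encode(chunk).decode("ascii"))
    lines.append("====")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(uuencode_base64(b"Cat", "cat.txt"), end="")
```

Unlike classic uuencoding, no length character starts each line; the amount of data is implied by the encoded text and its '=' padding.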

Trivia

Microsoft's e-mail program Outlook Express once erroneously accepted the word "begin" followed by two spaces at the start of a line as the start of a uuencoded attachment ("i.e.", it did not require the octal-encoded Unix-style permissions). Especially on Usenet, where MIME is seldom used and plain text is preferred, some people would embed "begin" followed by two spaces in their messages in order to maliciously hide the rest of the message from Outlook Express users ("e.g.", they configured their news client to introduce quotations with a line starting "begin", two spaces, "quote from xxx") [http://support.microsoft.com/default.aspx/kb/898124].

See also

* Base64
* BinHex
* MIME
* XXEncode
* YEnc

References

* [http://www.opengroup.org/onlinepubs/009695399/utilities/uuencode.html IEEE Std 1003.1 uuencode man page]

External links

* [http://www.gnu.org/software/sharutils/ GNU sharutils] - The Free Software Foundation's sharutils bundle includes uuencode, uudecode, and others.
* [http://www.fpx.de/fp/Software/UUDeview/ UUDeview] - open-source program to encode/decode Base64, BinHex, uuencode, xxencode, etc. for Unix/Windows/DOS
* [http://www.bastet.com/ UUENCODE-UUDECODE] - open-source program to encode/decode created by Clem "Grandad" Dye
* [http://www.stuartcheshire.org/StUU.html StUU] - Open Source fast UUDecoder for Macintosh by Stuart Cheshire


Wikimedia Foundation. 2010.
