Text file

A text file (sometimes spelled "textfile") is a kind of computer file that is structured as a sequence of lines. A text file exists within a computer file system. The end of a text file is often denoted by placing one or more special characters, known as an end-of-file marker, after the last line in a text file.

"Text file" refers to a type of container, while plain text refers to a type of content. Text files can contain plain text, but they are not limited to such.

At a generic level of description, there are two kinds of computer files: text files and binary files. [Lewis, John (2006). Computer Science Illuminated. Jones and Bartlett. ISBN 0763741493.]

Data storage

Because of their simplicity, text files are commonly used for storing information. They avoid some of the problems encountered with other file formats, such as endianness, padding bytes, or differences in the number of bytes in a machine word. Further, when data corruption occurs in a text file, it is often easier to recover and continue processing the remaining contents. A disadvantage of text files is that they usually have low entropy, meaning that the information occupies more storage than is strictly necessary.
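The endianness problem can be illustrated with a minimal Python sketch. The value 1025 and the variable names are chosen here purely for illustration. The same integer produces different bytes depending on byte order when stored in binary, but a single unambiguous byte sequence when stored as decimal text.

```python
import struct

value = 1025  # illustrative value, equal to 0x0401

# Binary storage: a 32-bit unsigned integer depends on the byte order chosen.
little = struct.pack("<I", value)  # little-endian layout
big = struct.pack(">I", value)     # big-endian layout

# Text storage: the decimal digits are written in reading order everywhere.
as_text = str(value).encode("ascii")

print(little, big, as_text)
```

The text form takes four bytes here as well, but larger values or padded fields generally cost more bytes as text than in binary, which is the storage overhead noted above.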

Formats

ASCII

The ASCII standard allows ASCII-only text files (unlike most other file types) to be freely interchanged and read on Unix, Macintosh, Microsoft Windows, DOS, and other systems. These systems differ in their preferred line ending convention and in their interpretation of values outside the ASCII range (their character encoding).
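As a sketch of how the differing line ending conventions can be reconciled, the hypothetical helper below (`normalize_newlines` is a name invented for this example) converts the CR LF endings of DOS and Windows and the lone CR endings of classic Mac OS to the LF endings used on Unix.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CR LF and lone CR line endings to LF."""
    # Replace the two-byte sequence first so it is not turned into two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


mixed = b"dos line\r\nold mac line\runix line\n"
print(normalize_newlines(mixed))
```

The order of the two replacements matters: handling lone CR first would split every CR LF pair into two line breaks.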

MIME

Text files usually have the MIME type "text/plain", often with additional information indicating an encoding. Prior to the advent of Mac OS X, the Mac OS system regarded the content of a file (the data fork) to be a text file when its resource fork indicated that the type of the file was "TEXT". Under the Windows operating system, a file is regarded as a text file if the suffix of the name of the file (the "extension") is "txt". However, many other suffixes are used for text files with specific purposes. For example, source code for computer programs is usually kept in text files that have file name suffixes indicating the programming language in which the source is written.
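The mapping from a file name suffix to a MIME type can be sketched with Python's standard `mimetypes` module. The file name `notes.txt` and the choice of UTF-8 as the declared encoding are assumptions made for this example.

```python
import mimetypes

# Guess the MIME type from the file name suffix alone (no file is opened).
mime_type, _ = mimetypes.guess_type("notes.txt")

# The encoding is commonly attached as a "charset" parameter.
content_type = f"{mime_type}; charset=utf-8"
print(content_type)
```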

.txt

.txt is a filename extension for files consisting of text that usually contains very little formatting (e.g., no bold or italics). The precise definition of the .txt format is not specified, but it typically matches the format accepted by the system terminal or a simple text editor. Files with the .txt extension can easily be read or opened by any program that reads text and, for that reason, are considered universal (or platform independent).

The ASCII character set is the most common format for English-language text files, and is generally assumed to be the default file format in many situations. For accented and other non-ASCII characters, it is necessary to choose a character encoding. In many systems, this is chosen on the basis of the default locale setting on the computer it is read on. Common character encodings include ISO 8859-1 for many European languages.

Because many encodings have only a limited repertoire of characters, they are often only usable to represent text in a limited subset of human languages. Unicode is an attempt to create a common standard for representing all known languages, and most known character sets are subsets of the very large Unicode character set. Although there are multiple character encodings available for Unicode, the most common is UTF-8, which has the advantage of being backwards-compatible with ASCII: that is, every ASCII text file is also a UTF-8 text file with identical meaning.
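The backwards compatibility of UTF-8 with ASCII can be checked directly. In this minimal sketch the sample strings are illustrative: an ASCII-only string encodes to the same bytes under both encodings, while an accented character is encoded differently by ISO 8859-1 and UTF-8.

```python
ascii_text = "plain text"

# ASCII-only text: the ASCII and UTF-8 encodings are byte-for-byte identical.
ascii_bytes = ascii_text.encode("ascii")
utf8_bytes = ascii_text.encode("utf-8")

# Non-ASCII text: "\u00e9" is the letter e with an acute accent.
accented = "caf\u00e9"
latin1_bytes = accented.encode("iso-8859-1")  # one byte for the accented letter
utf8_accented = accented.encode("utf-8")      # two bytes for the accented letter

print(ascii_bytes == utf8_bytes, latin1_bytes, utf8_accented)
```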

Standard Windows .txt files

Microsoft MS-DOS and Windows use a common text file format, with each line of text separated by a two-character combination: CR and LF, which have ASCII codes 13 and 10. It is common for the last line of text not to be terminated with a CR-LF marker, and many text editors (including Notepad) do not automatically insert one on the last line.
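A short sketch of this layout follows. The helper `join_lines` and its `terminate_last` parameter are hypothetical names introduced for this example. Lines are separated by the byte pair 13, 10, and the final line is left unterminated unless requested.

```python
CRLF = bytes([13, 10])  # CR (13) followed by LF (10)


def join_lines(lines, terminate_last=False):
    """Join ASCII lines with CR LF, optionally terminating the last line too."""
    data = CRLF.join(line.encode("ascii") for line in lines)
    if terminate_last and lines:
        data += CRLF
    return data


data = join_lines(["first", "second"])
print(data)

# Both variants split back into the same two lines.
print(data.decode("ascii").splitlines())
```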

Most Windows text files use a form of ANSI, OEM or Unicode encoding. What Windows terminology calls "ANSI encodings" are usually single-byte code pages similar to the ISO-8859 encodings, except in locales such as Chinese, Japanese and Korean that require double-byte character sets. ANSI encodings were traditionally used as default system locales within Windows, before the transition to Unicode. By contrast, OEM encodings, also known as MS-DOS code pages, were defined by IBM for use in the original IBM PC text mode display system. They typically include graphical and line-drawing characters common in full-screen MS-DOS applications. Newer Windows text files may use a Unicode encoding such as UTF-16LE or UTF-8.
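The difference between these encoding families shows up when the same byte is decoded under each. The sketch below assumes Windows-1252 as the ANSI code page and code page 437 as the OEM code page, a typical pairing on Western systems; other locales use other pairs.

```python
raw = bytes([0xC4])  # a single byte outside the ASCII range

ansi_char = raw.decode("cp1252")  # ANSI reading: an accented capital A
oem_char = raw.decode("cp437")    # OEM reading: a horizontal line-drawing character

# The ANSI character re-encoded as UTF-16LE occupies two bytes.
utf16_bytes = ansi_char.encode("utf-16-le")

print(ansi_char, oem_char, utf16_bytes)
```

A file written under one code page and read under the other therefore shows different characters even though its bytes are unchanged.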

Rendering

When a text file is opened in a text editor, human-readable content is presented to the user, usually the file's plain text. Depending on the application, control codes may be rendered either as literal instructions acted upon by the editor, or as visible escape characters that can be edited as plain text. Even when a text file contains plain text, control characters within the file (especially the end-of-file character) can keep some of that text from being displayed by a particular rendering method.
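The two rendering behaviours can be sketched as follows. Both function names are hypothetical. The example assumes the end-of-file character is Ctrl-Z (code 26), the convention used by CP/M and MS-DOS; other systems use no such character.

```python
def show_controls(text: str) -> str:
    """Render control characters in caret notation, keeping newline and tab."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20 and ch not in "\n\t":
            out.append("^" + chr(code + 0x40))  # e.g. code 26 becomes ^Z
        elif code == 0x7F:
            out.append("^?")  # DEL
        else:
            out.append(ch)
    return "".join(out)


def visible_until_eof(text: str, eof: str = "\x1a") -> str:
    """Return only the text before the first end-of-file character."""
    return text.split(eof, 1)[0]


sample = "shown\x1ahidden"
print(show_controls(sample))      # control character made visible
print(visible_until_eof(sample))  # text after the marker is not displayed
```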

Notes and references

See also

*List of file formats
*File extensions
*ASCII
*EBCDIC
*Text editor
*Unicode

External links

* [http://c2.com/cgi/wiki?PowerOfPlainText C2: the Power of Plain Text]


Wikimedia Foundation. 2010.
