Because of their simplicity, text files are commonly used for storage[->0] of information. They avoid some of the problems encountered with other file formats, such as endianness[->1], padding bytes, or differences in the number of bytes in a machine word[->2]. Further, when data corruption[->3] occurs in a text file, it is often easier to recover and continue processing the remaining contents. A disadvantage of text files is that they usually have low entropy[->4], meaning that the information occupies more storage than is strictly necessary.
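The trade-off described above can be illustrated with a minimal sketch. The values 1025 and 4294967295 are arbitrary choices for illustration, and the fixed 4-byte unsigned layout is an assumption made for the example, not part of any text file convention.

```python
import struct

value = 1025  # arbitrary example value (0x401)

# Binary storage: the same integer produces different byte sequences
# depending on the byte order chosen by the writer.
little = struct.pack("<I", value)  # 4-byte unsigned, little-endian
big = struct.pack(">I", value)     # 4-byte unsigned, big-endian

# Text storage: the decimal digits are the same bytes on every machine.
text = str(value).encode("ascii")

print(little.hex())  # 01040000
print(big.hex())     # 00000401
print(text)          # b'1025'

# The cost of the text form: a large value needs more bytes as digits
# than as a fixed-width binary integer.
large = 4294967295  # largest value that fits in 4 unsigned bytes
print(len(str(large).encode("ascii")))  # 10
print(len(struct.pack("<I", large)))    # 4
```

A reader of the binary form must know the byte order in advance, whereas the text form can be parsed the same way everywhere, at the price of extra storage for large values.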
A simple text file needs no additional metadata[->5] to assist the reader in interpretation, and therefore may contain no data at all, in which case it is a zero-byte file[->6].
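As a small sketch of the empty case, the following creates a text file without writing anything to it and then reads it back. The file name `empty.txt` and the use of a temporary directory are assumptions made only for the example.

```python
import os
import tempfile

# Work in a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "empty.txt")  # hypothetical file name

    # Opening for writing and closing immediately creates a file with no content.
    with open(path, "w", encoding="utf-8"):
        pass

    size = os.path.getsize(path)
    with open(path, encoding="utf-8") as f:
        content = f.read()

print(size)           # 0
print(repr(content))  # ''
```

The file is still a valid text file: reading it succeeds and simply yields an empty string.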
.txt is a file format for files consisting of text usually containing very little formatting (e.g., no bolding[->7] or italics[->8]). The precise definition of the .txt format is not specified, but typically matches the format accepted by the system terminal[->9] or simple text editor[->10]. Files with the .txt extension can easily be read or opened by any program that reads text and, for that reason, are considered universal (or platform independent[->11]).
The ASCII character set[->12] is the most common encoding for English-language text files, and is generally assumed to be the default in many situations. For accented and other non-ASCII characters, it is necessary to choose a character encoding. In many systems, this is chosen on the basis of the default locale[->13] setting of the computer on which the file is read. Common character encodings include ISO 8859-1[->14] for many European languages.
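The need to agree on an encoding can be sketched as follows. The sample word and the use of UTF-8 as the second encoding are assumptions chosen for contrast; the text above names only ASCII and ISO 8859-1.

```python
import locale

text = "caf\u00e9"  # "café": three ASCII letters plus one accented character

# The same text becomes different bytes under different encodings.
latin1_bytes = text.encode("iso-8859-1")
utf8_bytes = text.encode("utf-8")
print(latin1_bytes)  # b'caf\xe9'
print(utf8_bytes)    # b'caf\xc3\xa9'

# Decoding with the wrong encoding does not fail here; it silently
# produces the wrong characters.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)

# Pure ASCII text is byte-for-byte identical in all three encodings.
plain = "cafe"
same = plain.encode("ascii") == plain.encode("iso-8859-1") == plain.encode("utf-8")
print(same)  # True

# The encoding a system assumes by default depends on its locale settings,
# so the value printed here varies from machine to machine.
print(locale.getpreferredencoding(False))
```

Passing an explicit `encoding` argument when opening a file avoids depending on the locale default of whichever machine happens to read it.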
Because many encodings have only a limited repertoire of characters, they are often only usable to represent text in a limited subset of human languages. Unicode[->15] is an attempt to create a common standard for representing all known languages, and most known character sets are subsets of the very large Unicode character set. Although there are...