Character encoding : Character set

A character encoding is a code that pairs a set of natural language characters (such as an alphabet or syllabary) with a set of something else, such as numbers or electrical pulses. Common examples include Morse code, which encodes letters of the Roman alphabet as series of long and short depressions of a telegraph key, and ASCII, which encodes letters, numerals, and other symbols both as integers and as 7-bit binary versions of those integers.
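As a small illustration of the ASCII case, the sketch below lists each character of a string with its integer code and the 7-bit binary form of that integer. The helper name ascii_table is an assumption made for this example, not part of any standard.

```python
def ascii_table(text):
    """Return (character, integer code, 7-bit binary string) for each character.

    ascii_table is a hypothetical helper written for this illustration.
    """
    rows = []
    for ch in text:
        code = ord(ch)  # integer code of the character
        if code > 127:
            raise ValueError(f"{ch!r} is outside the ASCII repertoire")
        rows.append((ch, code, format(code, "07b")))  # zero-padded to 7 bits
    return rows


for ch, code, bits in ascii_table("Hi!"):
    print(ch, code, bits)
```

For the string "Hi!" this prints the codes 72, 105 and 33 alongside their 7-bit patterns.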

In some contexts (especially computer storage and communication) it makes sense to distinguish a character repertoire, which is the full set of abstract characters that a system supports, from a coded character set or character encoding, which specifies how to represent characters from that repertoire using integer codes.

In the early days of computing, most systems used only the character repertoire of the ASCII code. This was soon seen to be inadequate, and a number of ad-hoc methods were used to extend it. The need to support multiple writing systems, including the CJK family of scripts, required a far larger number of characters and a systematic approach to character encoding in place of the earlier ad-hoc extensions.

For example, the full repertoire of Unicode encompasses over 100,000 characters, each being assigned a unique integer code in the range 0 to hexadecimal 10FFFF (a little over 1.1 million, so not all integers in that range represent coded characters). Other common repertoires include ASCII and ISO 8859-1, which are identical to the first 128 and 256 coded characters of Unicode respectively.
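The relationship between these repertoires can be checked with a short sketch. It relies on Python's built-in "ascii" and "iso-8859-1" codecs; the function name matches_unicode_prefix and the constant MAX_CODE_POINT are assumptions made for this example.

```python
MAX_CODE_POINT = 0x10FFFF  # top of the Unicode code range


def matches_unicode_prefix(codec, count):
    """Check that byte values 0..count-1 decode to the same Unicode code points.

    matches_unicode_prefix is a hypothetical helper for this illustration.
    """
    for value in range(count):
        ch = bytes([value]).decode(codec)
        if ord(ch) != value:
            return False
    return True


print(matches_unicode_prefix("ascii", 128))       # first 128 coded characters
print(matches_unicode_prefix("iso-8859-1", 256))  # first 256 coded characters
print(MAX_CODE_POINT + 1)                         # number of integers in the range
```

Both checks print True, and the last line prints 1114112, the "little over 1.1 million" integers in the range.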

The term character encoding is sometimes overloaded to also mean how characters are represented as a specific sequence of bits. This involves an encoding form, in which the integer code is converted to a series of integer code values that facilitate storage in a system that uses fixed bit widths. For example, integers greater than 65535 will not fit in 16 bits, so the UTF-16 encoding form mandates that these integers be represented as a surrogate pair of integers that are less than 65536 and that are not assigned to characters (e.g., hex 10000 becomes the pair D800 DC00). An encoding scheme then converts code values to bit sequences, taking into account platform-dependent byte order (e.g., D800 DC00 might become 00 D8 00 DC on a little-endian architecture such as Intel x86). A character set, character map, or code page shortcuts this process by directly mapping abstract characters to specific bit patterns. Unicode Technical Report #17 explains this terminology in depth and provides further examples.
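The two steps described above, encoding form and then encoding scheme, can be sketched as follows. The function name to_surrogate_pair is an assumption made for this example; the arithmetic is the standard UTF-16 surrogate calculation, and the result is cross-checked against Python's built-in "utf-16-le" and "utf-16-be" codecs.

```python
import struct


def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into a UTF-16 surrogate pair.

    to_surrogate_pair is a hypothetical helper for this illustration.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


# Encoding form: code point -> 16-bit code values.
high, low = to_surrogate_pair(0x10000)
print(f"{high:04X} {low:04X}")           # D800 DC00

# Encoding scheme: 16-bit code values -> bytes in a chosen byte order.
little_endian = struct.pack("<2H", high, low)
big_endian = struct.pack(">2H", high, low)
print(little_endian.hex())               # 00d800dc
print(big_endian.hex())                  # d800dc00

# Cross-check against the built-in codecs.
assert little_endian == chr(0x10000).encode("utf-16-le")
assert big_endian == chr(0x10000).encode("utf-16-be")
```

The same code values yield different byte sequences depending on byte order, which is why the encoding scheme has to be specified separately from the encoding form.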

Since most applications use only a small subset of Unicode, encoding schemes like UTF-8 and UTF-16, and character maps like ASCII, provide efficient ways to represent Unicode characters in computer storage or communications using short binary words. Some of these simple text encodings use data compression techniques to represent a large repertoire with a smaller number of codes.
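To make the storage trade-off concrete, the sketch below compares how many bytes a few sample characters occupy in UTF-8 and in UTF-16. The helper name encoded_sizes and the choice of sample characters are assumptions made for this example; the "utf-16-le" codec is used so that no byte order mark is added to the count.

```python
def encoded_sizes(text):
    """Return the byte length of text under UTF-8 and UTF-16 (little-endian, no BOM).

    encoded_sizes is a hypothetical helper for this illustration.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16-le": len(text.encode("utf-16-le")),
    }


samples = ["A", "\u00e9", "\u4e2d", "\U00010000"]  # U+0041, U+00E9, U+4E2D, U+10000
for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

Characters in the ASCII range take one byte in UTF-8 and two in UTF-16, while characters above U+FFFF take four bytes in both.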


Licence of article: GNU FDL.
Original source @ wikipedia.