Unicode input


Unicode input is the insertion of a specific Unicode character on a computer by a user; it is a common way to input characters not directly supported by a physical keyboard. Unicode characters can be produced either by selecting them from a display or by typing a certain sequence of keys on a physical keyboard. In addition, a character produced by one of these methods in one web page or document can be copied into another. In contrast to ASCII's 128-character set (which it contains), of which 95 characters are printable, Unicode encodes well over a hundred thousand characters from almost all of the world's written languages, along with many other signs and symbols.[1]

A Unicode input system must provide for a large repertoire of characters, ideally all valid Unicode code points. This differs from a keyboard layout, which defines keys and key combinations only for the limited number of characters appropriate to a particular locale.

Unicode numbers

Unicode characters are distinguished by code points, which are conventionally represented by "U+" followed by four, five or six hexadecimal digits, for example U+00AE or U+1D310. Characters in the Basic Multilingual Plane (BMP), containing modern scripts – including many Chinese and Japanese characters – and many symbols, have a 4-digit code. Historic scripts, but also many modern symbols and pictographs (such as emoticons, emojis, playing cards and many CJK characters) have 5-digit codes.
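The "U+" notation described above can be illustrated with a minimal Python sketch. The function names (format_code_point, parse_code_point, in_bmp) are hypothetical helpers introduced only for this illustration; they rely on the built-in ord, chr and int functions.

```python
def format_code_point(ch: str) -> str:
    """Return the conventional "U+" notation for a single character."""
    # Zero-pad to at least four hexadecimal digits; code points above
    # U+FFFF naturally take five or six digits.
    return f"U+{ord(ch):04X}"


def parse_code_point(notation: str) -> str:
    """Return the character named by a string such as "U+00AE"."""
    if not notation.upper().startswith("U+"):
        raise ValueError(f"not in U+ notation: {notation!r}")
    return chr(int(notation[2:], 16))


def in_bmp(ch: str) -> bool:
    """True if the character lies in the Basic Multilingual Plane."""
    return ord(ch) <= 0xFFFF


if __name__ == "__main__":
    for notation in ("U+00AE", "U+1D310"):
        ch = parse_code_point(notation)
        print(notation, format_code_point(ch), "BMP" if in_bmp(ch) else "outside BMP")
```

Under these assumptions, U+00AE round-trips as a 4-digit BMP code, while U+1D310 keeps its five digits and falls outside the BMP.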

Availability

An application can display a character only if it can access a font which contains a glyph for the character.[2] Very few fonts have full Unicode coverage; most contain only the glyphs needed to support a few writing systems. However, most modern browsers and other text-processing applications are able to display multilingual content because they perform font substitution, automatically switching to a fallback font when necessary to display characters which are not supported in the current font. Which fonts are used for fallback, and how thorough their Unicode coverage is, vary by software and operating system; some software will search for a suitable glyph in all of the installed fonts, while other software searches only within certain fonts.
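Font substitution as described above can be sketched as a lookup over an ordered list of fonts. This is only a toy model, not how any particular browser or operating system implements it: the font names, the FONT_COVERAGE table and the pick_font function are all hypothetical, and real fonts cover far more code points.

```python
# Hypothetical fonts mapped to the code points they have glyphs for.
FONT_COVERAGE = {
    "PrimarySans": set(range(0x20, 0x7F)) | {0x00AE},
    "FallbackCJK": set(range(0x4E00, 0x4E10)),
    "FallbackSymbols": {0x2200, 0x2203, 0x1D310},
}


def pick_font(ch, preferred, fallbacks):
    """Return the first font in the chain with a glyph for ch, else None."""
    code_point = ord(ch)
    for name in [preferred, *fallbacks]:
        if code_point in FONT_COVERAGE.get(name, set()):
            return name
    # No font in the chain covers the character.
    return None


if __name__ == "__main__":
    chain = ["FallbackCJK", "FallbackSymbols"]
    for ch in ("A", "\u2200", "\u4e00", "\u0416"):
        print(f"U+{ord(ch):04X}", pick_font(ch, "PrimarySans", chain))
```

In this sketch the order of the fallback list decides which font wins when several cover the same character, which mirrors the point that results vary with the fonts an application chooses to search.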

If an application does not have access to a glyph, the character will usually be shown as the font's ".notdef" glyph ⟨