ISO/IEC 8859-9

MIME / IANA: ISO-8859-9
Alias(es): iso-ir-148, latin5, l5, csISOLatin5[1]
Standard: TS 5881, ECMA-128, ISO/IEC 8859
Classification: ISO 8859 (extended ASCII, ISO 4873 level 1)
Extends: US-ASCII
Based on: ISO/IEC 8859-1
Preceded by: ISO/IEC 8859-3
Other related encoding(s): Windows-1254

ISO/IEC 8859-9:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 9: Latin alphabet No. 5, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings; its first edition was published in 1989. It is designated ECMA-128 by Ecma International and TS 5881 as a Turkish standard.[2] It is informally referred to as Latin-5 or Turkish. It was designed to cover the Turkish language and was intended to be of more use than the ISO/IEC 8859-3 encoding. It is identical to ISO/IEC 8859-1 except that six Icelandic characters are replaced with characters unique to the Turkish alphabet:

Position 0xD0 0xDD 0xDE 0xF0 0xFD 0xFE
8859-9 Ğ İ Ş ğ ı ş
8859-1 Ð Ý Þ ð ý þ
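
The six replacements can be checked mechanically. The following is a minimal sketch, assuming Python's built-in "iso8859_9" and "latin_1" codecs are faithful to the two standards; the names POSITIONS, differing_positions and compare are made up for this illustration.

```python
# Byte positions where ISO/IEC 8859-9 departs from ISO/IEC 8859-1,
# as listed in the table above.
POSITIONS = [0xD0, 0xDD, 0xDE, 0xF0, 0xFD, 0xFE]


def differing_positions():
    """Return every byte value that the two codecs decode differently."""
    return [
        b
        for b in range(0x100)
        if bytes([b]).decode("iso8859_9") != bytes([b]).decode("latin_1")
    ]


def compare(positions=POSITIONS):
    """Return (position, 8859-9 character, 8859-1 character) triples."""
    rows = []
    for b in positions:
        raw = bytes([b])
        rows.append((f"0x{b:02X}", raw.decode("iso8859_9"), raw.decode("latin_1")))
    return rows


for position, turkish, western in compare():
    print(position, turkish, western)
```

A practical consequence is that Turkish text decoded as ISO/IEC 8859-1 stays legible apart from these six letters, which appear as their Icelandic counterparts.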

ISO-8859-9 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. In modern applications Unicode and UTF-8 are preferred; authors of new web pages and the designers of new protocols are instructed to use UTF-8 instead.[3] As of August 2019, 0.1% of all web pages use ISO-8859-9,[4][5] while 3.1% of web pages located in Turkey use ISO-8859-9.[6] However, the WHATWG Encoding Standard, which specifies the character encodings that are permitted in HTML5 and that compliant browsers must support,[7] requires that web pages marked as ISO-8859-9 be handled as Windows-1254.[3] Windows-1254 differs from ISO-8859-9 in the 0x80-0x9F range: ISO-8859-9 reserves it for C1 control codes, whereas Windows-1254 uses it for additional graphical characters (analogous to the relationship between ISO-8859-1 and Windows-1252).
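
The practical effect of this rule shows up only for bytes in the 0x80-0x9F range. The sketch below assumes Python's "cp1254" codec as a stand-in for the Windows-1254 handling described above; that codec leaves a few bytes in the range undefined, so errors="replace" is passed to avoid an exception. The function name decode_both and the sample bytes are invented for the illustration.

```python
def decode_both(data: bytes):
    """Decode the same bytes as ISO-8859-9 and as Windows-1254."""
    as_iso = data.decode("iso8859_9")
    # Assumption: Python's cp1254 approximates what a browser does with a
    # page labelled ISO-8859-9. Undefined bytes become U+FFFD here.
    as_windows = data.decode("cp1254", errors="replace")
    return as_iso, as_windows


# "Türkçe" followed by curly quotes and a euro sign as Windows-1254 bytes.
sample = b"T\xfcrk\xe7e \x9310 \x80\x94"

as_iso, as_windows = decode_both(sample)
print(repr(as_iso))      # 0x93, 0x80 and 0x94 come out as C1 control codes
print(repr(as_windows))  # the same bytes come out as quotes and a euro sign
```

Outside 0x80-0x9F the two decodings agree, so text that uses only letters, digits and basic punctuation is unaffected by the substitution.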

Microsoft has assigned code page 28599, also known as Windows-28599, to ISO-8859-9 in Windows. IBM has assigned code page 920 (CCSID 920) to ISO-8859-9.[8][9] The standard is also published by Ecma International as ECMA-128.[10]
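
Software libraries usually accept several of these labels for the same encoding. As a small sketch, assuming Python's standard codecs registry, the IANA name and its registered aliases can be resolved to a single codec; the names IANA_LABELS and resolve are hypothetical, and vendor code page numbers are left out because their support varies between libraries.

```python
import codecs

# The IANA preferred name followed by its registered aliases.
IANA_LABELS = ["ISO-8859-9", "iso-ir-148", "latin5", "l5", "csISOLatin5"]


def resolve(label: str) -> str:
    """Return the canonical codec name Python uses for a charset label."""
    return codecs.lookup(label).name


for label in IANA_LABELS:
    print(label, "->", resolve(label))
```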

Codepage layout

Differences from ISO-8859-1 are followed by their Unicode code point number in parentheses.

ISO/IEC 8859-9[11][12][13]
   0 1 2 3 4 5 6 7 8 9 A B C D E F
0x
1x
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~
8x
9x
Ax NBSP ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ SHY ® ¯
Bx ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
Cx À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
Dx Ğ (011E) Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü İ (0130) Ş (015E) ß
Ex à á â ã ä å æ ç è é ê ë ì í î ï
Fx ğ (011F) ñ ò ó ô õ ö ÷ ø ù ú û ü ı (0131) ş (015F) ÿ
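
A chart like this one can be regenerated from a codec table. The following is a rough sketch, assuming Python's built-in "iso8859_9" codec matches the standard; the SP, NBSP and SHY labels mirror the chart, control-code positions are skipped, and the names CELL_LABELS and chart_row are invented for the example.

```python
# Characters with no visible glyph are shown by name, as in the chart.
CELL_LABELS = {0x20: "SP", 0xA0: "NBSP", 0xAD: "SHY"}


def chart_row(high: int) -> str:
    """Build one chart row for the given high nibble (0x0 to 0xF)."""
    cells = [f"{high:X}x"]
    for low in range(16):
        byte = (high << 4) | low
        if byte < 0x20 or 0x7F <= byte < 0xA0:
            continue  # C0, DEL and C1 positions hold no graphic character
        cells.append(CELL_LABELS.get(byte, bytes([byte]).decode("iso8859_9")))
    return " ".join(cells)


for high in range(16):
    print(chart_row(high))
```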

See also

  • Windows-1254

References

  1. Character Sets, Internet Assigned Numbers Authority (IANA), 2018-12-12
  2. "Latin-5: A list of the Latin-5 client and server CCSIDs, which includes Turkey". IBM. Archived from the original on 2022-02-13.
  3. van Kesteren, Anne. "Names and labels". Encoding Standard. WHATWG.
  4. "Historical trends in the usage of character encodings for websites". w3techs.com.
  5. "Frequently Asked Questions". w3techs.com.
  6. "Distribution of character encodings among websites that use Turkey". w3techs.com.
  7. "8.2.2.3. Character encodings". HTML 5.1 2nd Edition. W3C. User agents must support the encodings defined in the WHATWG Encoding standard, including, but not limited to […]
  8. "Code page 920 information document". Archived from the original on 2017-01-16.
  9. "CCSID 920 information document". Archived from the original on 2016-03-27.
  10. Standard ECMA-128: 8-Bit Single-Byte Coded Graphic Character Sets - Latin Alphabet No. 5 (2nd ed.). 1999. This Ecma publication is also approved as ISO 8859-9.
  11. Code Page CPGID 00920 (PDF), IBM
  12. Code Page CPGID 00920 (txt), IBM
  13. International Components for Unicode (ICU), ibm-920_P100-1995.ucm, 2002-12-03
