CJK Unified Ideographs

From Wikipedia, the free encyclopedia
CJKV ideograph in traditional and simplified Chinese, Korean, Vietnamese and Japanese

The Chinese, Japanese and Korean (CJK) scripts share a common background, and the characters they have in common are collectively known as CJK characters. In the process called Han unification, these shared characters were identified and named CJK Unified Ideographs. As of Unicode 14.0, Unicode defines a total of 92,865 CJK Unified Ideographs.[1]

The terms ideographs or ideograms may be misleading, since the Chinese script is not strictly a pictographic or ideographic system.

Historically, Vietnam also used Chinese characters, so the abbreviation CJKV is sometimes used. This system was replaced by the Latin-based Vietnamese alphabet in the 1920s.
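The total of 92,865 quoted above can be cross-checked against the per-block figures given in the block sections of this article. The following is a minimal sketch of that arithmetic, not an official tool: the table simply transcribes the assigned ranges and counts stated in the text, and the names `BLOCKS`, `COUNT_ONLY` and `total_unified` are made up for this illustration.

```python
# Assigned ranges and character counts as quoted in the block sections of
# this article (Unicode 14.0). Names are shortened for readability.
BLOCKS = [
    # (block, first assigned, last assigned, stated count)
    ("CJK Unified Ideographs", 0x4E00, 0x9FFF, 20_992),
    ("Extension A", 0x3400, 0x4DBF, 6_592),
    ("Extension B", 0x20000, 0x2A6DF, 42_720),
    ("Extension C", 0x2A700, 0x2B738, 4_153),
    ("Extension D", 0x2B740, 0x2B81D, 222),
    ("Extension E", 0x2B820, 0x2CEA1, 5_762),
    ("Extension F", 0x2CEB0, 0x2EBE0, 7_473),
]

# Figures for which the article gives a count but no exact last assigned
# code point (Extension G), or a list rather than a range (the twelve
# unified characters in the CJK Compatibility Ideographs block).
COUNT_ONLY = {
    "Extension G": 4_939,
    "CJK Compatibility Ideographs (unified only)": 12,
}


def total_unified() -> int:
    """Check each range against its stated count, then sum all counts."""
    for name, first, last, stated in BLOCKS:
        size = last - first + 1  # inclusive range
        assert size == stated, f"{name}: range holds {size}, text says {stated}"
    return sum(stated for *_, stated in BLOCKS) + sum(COUNT_ONLY.values())


print(total_unified())  # 92865
```

Each range is inclusive, so its size is `last - first + 1`; the assertion inside the loop confirms that every quoted range is consistent with its quoted count before the counts are added up.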

CJK Unified Ideographs blocks

CJK Unified Ideographs

The basic block named CJK Unified Ideographs (4E00–9FFF) contains 20,992 basic Chinese characters in the range U+4E00 through U+9FFF. The block includes not only characters used in the Chinese writing system but also kanji used in the Japanese writing system and hanja, whose use is diminishing in Korea. Many characters in this block are used in all three writing systems, while others are used in only one or two of the three. Chữ Hán are also used in Vietnam's chữ Nôm (now obsolete). The first 20,902 characters in the block are arranged according to the Kangxi Dictionary ordering of radicals. In this system, the characters written with the fewest strokes are listed first. The remaining characters were added later, and so are not in radical order.

The block is the result of Han unification,[2] which was somewhat controversial within East Asia.[3] Since Chinese, Japanese and Korean characters were coded in the same location, the appearance of a selected glyph could depend on the particular font being used. However, the source separation rule states that characters encoded separately in an earlier character set would remain separate in the new Unicode encoding.[4]

Using variation selectors, it is possible to specify certain variant CJK ideograms within Unicode. The Adobe-Japan1 character set, which has 14,683 ideographic variation sequences,[5] is an extreme example of the use of variation selectors.[6]
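At the code point level, an ideographic variation sequence is a base ideograph followed by one variation selector from the Variation Selectors Supplement (U+E0100 through U+E01EF, that is, VS17 through VS256). The sketch below shows how such a sequence could be built and recognised in a string. The helper names are hypothetical and the base character 葛 (U+845B) is chosen purely for illustration; whether a particular pair is registered, and how it renders, depends on the Ideographic Variation Database and on the font in use.

```python
# Variation selectors VS17..VS256 occupy U+E0100..U+E01EF.
IVS_FIRST = 0xE0100
IVS_LAST = 0xE01EF


def is_ivs_selector(ch: str) -> bool:
    """True if ch is one of the selectors VS17..VS256."""
    return IVS_FIRST <= ord(ch) <= IVS_LAST


def make_ivs(base: str, selector_number: int) -> str:
    """Append variation selector VS<selector_number> (17..256) to base."""
    if not 17 <= selector_number <= 256:
        raise ValueError("ideographic variation selectors are VS17..VS256")
    return base + chr(IVS_FIRST + selector_number - 17)


def split_units(text: str) -> list[str]:
    """Split text into units, keeping each selector with its base character."""
    units: list[str] = []
    for ch in text:
        if is_ivs_selector(ch) and units:
            units[-1] += ch  # attach selector to the preceding character
        else:
            units.append(ch)
    return units


seq = make_ivs("\u845B", 17)       # U+845B followed by U+E0100
print(len(seq))                    # 2 code points, displayed as one glyph
print(split_units("A" + seq + "B"))
```

Because the selector is a separate code point, naive per-character processing (length counts, slicing, reversal) can separate it from its base, which is why `split_units` groups the two together.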

Charts

4E00–62FF, 6300–77FF, 7800–8CFF, 8D00–9FFF.

Sources

Note: Most characters appear in multiple sources, making the sum of individual character counts (102,698) far more than the number of encoded characters (20,992).[7]

Country or region Code Source[8] Character count Total
 China G0 GB 2312-80 6,763 20,841
G1 GB 12345-90 2,202
G3 GB 7589-87 traditional form 4,834
G5 GB 7590-87 traditional form 2,841
G7 Modern Chinese general character chart (Simplified Chinese: 现代汉语通用字表) 42
G8 GB 8565-88 199
GCE National Academy for Educational Research 4
GDM Place name characters from the Public Order Administration, Ministry of Public Security, People's Republic of China 2
GE GB16500-95 3,772
GFC Modern Chinese Standard Dictionary (现代汉语规范词典第二版) 2
GGFZ Tongyong Guifan Hanzi Zidian (通用规范汉字字典) 1
GH GB/T 15564-1995 59
GHZ Hanyu Dazidian Ideographs (漢語大字典) 1
GHZR 汉语大字典(第二版) 1
GK GB 12052-89 89
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 16
GKX Kangxi Dictionary Ideographs (康熙字典) 3
GLK 龍龕手鑑 1
GT Standard Telegraph Codebook (revised), 1983 8
GZFY Hanyu Fangyan Dacidian (汉语方言大词典) 1
 Hong Kong H Hong Kong Supplementary Character Set, 2008 2,292 15,376
HB0 Computer Chinese Glyph and Character Code Mapping Table, Technical Report C-26 (電腦用中文字型與字碼對照表, 技術通報C-26) 9
HB1 Big-5, Level 1 5,401
HB2 Big-5, Level 2 7,650
HD Hong Kong Supplementary Character Set, 2016 24
 Japan J0 JIS X 0208-1990 6,356 12,565
J1 JIS X 0212-1990 3,058
J13 JIS X 0213:2004 level-3 characters replacing J1 characters 1,037
J13A JIS X 0213:2004 level-3 character addendum from JIS X 0213:2000 level-3 replacing J1 character 2
J14 JIS X 0213:2004 level-4 characters replacing J1 characters 1,704
J3 JIS X 0213:2004 Level 3 95
J3A JIS X 0213:2004 Level 3 addendum 7
J4 JIS X 0213:2004 Level 4 301
JARIB ARIB STD-B24 3
JMJ Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業) 2
 Macau MA HKSCS-2008 29 200
MB1 Big Five 10
MB2 Big Five 7
MC MCSCS Reference 3
MD MCSCS horizontal extensions 127
MDH MCSCS horizontal extensions 24
 North Korea KP0 KPS 9566-97 4,652 15,011
KP1 KPS 10721-2000 10,359
 South Korea K0 KS C 5601-87 (now KS X 1001:2004) 4,620 15,440
K1 KS C 5657-91 (now KS X 1002:2001) 2,855
K2 PKS C 5700-1:1994 7,911
K3 PKS C 5700-2:1994 1
K4 PKS 5700-3:1998 4
K6 KS X 1027-5:2014 49
 Taiwan T1 CNS 11643-1992 plane 1 5,413 18,383
T2 CNS 11643-1992 plane 2 7,650
T3 CNS 11643-1992 plane 3 4,144
T4 CNS 11643-1992 plane 4 894
T5 CNS 11643-1992 plane 5 64
T6 CNS 11643-1992 plane 6 31
T7 CNS 11643-1992 plane 7 16
TB CNS 11643-1992 plane 11 2
TC CNS 11643-1992 plane 12 2
TE CNS 11643-1992 plane 14 9
TF CNS 11643-1992 plane 15 158
 Vietnam V0 TCVN 5773-1993 599 4,806
V1 TCVN 6056:1995 3,307
V2 VHN 01-1998 759
V3 VHN 02-1998 91
V4 Dictionary on Nom (Từ điển chữ Nôm); Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày); Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam) 19
VN Vietnamese horizontal extensions 31
n/a UTC UTC sources 76 76
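The figure of 102,698 in the note above the table is the sum of the per-region totals in the last column. As a quick check, the totals can be transcribed and added up; the sketch below does only that, using variable names invented for this example.

```python
# Per-region totals copied from the "Total" column of the table above.
REGION_TOTALS = {
    "China": 20_841,
    "Hong Kong": 15_376,
    "Japan": 12_565,
    "Macau": 200,
    "North Korea": 15_011,
    "South Korea": 15_440,
    "Taiwan": 18_383,
    "Vietnam": 4_806,
    "UTC": 76,
}
ENCODED_CHARACTERS = 20_992  # size of the basic block

source_references = sum(REGION_TOTALS.values())
print(source_references)  # 102698

# Average number of regional sources that reference each encoded character.
print(round(source_references / ENCODED_CHARACTERS, 2))
```

The ratio of the two numbers shows why the sum exceeds the block size: on average, each encoded character is referenced by several regional sources.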

In Unicode 4.1, 14 HKSCS-2004 characters and 8 GB 18030 characters were assigned to code points U+9FA6 through U+9FBB. Since then, further characters have been added to this block for various reasons, all summarized in the version history section below.

CJK Unified Ideographs Extension A

The block named CJK Unified Ideographs Extension A (3400–4DBF) contains 6,592 additional characters in the range U+3400 through U+4DBF.

Charts

3400–4DBF.

Sources

Note: Most characters appear in more than one source, making the sum of individual character counts (18,828) far more than the number of encoded characters (6,592).[7]

Country or region Code Source[8] Character count Total
 China G3 GB 7589-87 traditional form 2,391 6,197
G5 GB 7590-87 traditional form 1,226
G7 Modern Chinese general character chart 120
GGFZ Tongyong Guifan Hanzi Zidian (通用规范汉字字典) 2
GHZ Hanyu Dazidian Ideographs (漢語大字典) 340
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 3
GKX Kangxi Dictionary Ideographs (康熙字典) 1,889
GS Singapore Chinese characters 226
 Hong Kong H Hong Kong Supplementary Character Set, 2008 572 572
 Japan J3 JIS X 0213:2004 Level 3 2 738
J4 JIS X 0213:2004 Level 4 78
JA Japanese IT Vendors Contemporary Ideographs, 1993 574
JA3 JIS X 0213:2004 level-3 characters replacing JA characters 17
JA4 JIS X 0213:2004 level-4 characters replacing JA characters 67
 Macau MA HKSCS-2008 4 12
MD MCSCS horizontal extensions 8
 North Korea KP0 KPS 9566-97 1 3,189
KP1 KPS 10721-2000 3,188
 South Korea K3 PKS C 5700-2:1994 1,833 1,863
K4 PKS 5700-3:1998 2
K6 KS X 1027-5:2014 28
 Taiwan T3 CNS 11643-1992 plane 3 2,179 5,916
T4 CNS 11643-1992 plane 4 2,919
T5 CNS 11643-1992 plane 5 399
T6 CNS 11643-1992 plane 6 200
T7 CNS 11643-1992 plane 7 133
TE CNS 11643-1992 plane 14 1
TF CNS 11643-1992 plane 15 85
 United Kingdom UK IRG N2107R2 2 2
 Vietnam V0 TCVN 5773-1993 140 319
V2 VHN 01-1998 149
V3 VHN 02-1998 19
V4 Dictionary on Nom (Từ điển chữ Nôm); Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày); Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam) 5
VN Vietnamese horizontal extensions 6
n/a UTC UTC sources 20 20

CJK Unified Ideographs Extension B

The block named CJK Unified Ideographs Extension B (20000–2A6DF) contains 42,720 characters in the range U+20000 through U+2A6DF. These include most of the characters used in the Kangxi Dictionary that are not in the basic CJK Unified Ideographs block, as well as many Hán-Nôm characters that were formerly used to write Vietnamese.

Charts

20000–215FF, 21600–230FF, 23100–245FF, 24600–260FF, 26100–275FF, 27600–290FF, 29100–2A6DF.

Sources

Note: Many characters appear in more than one source, making the sum of individual character counts (74,136) far more than the number of encoded characters (42,720).[7]

Country or region Code Source[8] Character count Total
 China G3 GB 7589-87 traditional form 1 30,498
G4K Siku Quanshu 477
GBK Encyclopedia of China 86
GCH Ci Hai (辞海) 247
GCY Ci Yuan (辭源) 66
GFZ Founder Press System 65
GGFZ Tongyong Guifan Hanzi Zidian (通用规范汉字字典) 5
GHC Hanyu Dacidian (漢語大詞典) 553
GHF 漢文佛典疑難俗字彙釋與研究 1
GHZ Hanyu Dazidian Ideographs (漢語大字典) 10,508
GHZR 汉语大字典(第二版) 1
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 17
GKX Kangxi Dictionary Ideographs (康熙字典) 18,471
 Hong Kong H Hong Kong Supplementary Character Set, 2008 1,703 1,703
 Japan J3 JIS X 0213:2004 Level 3 25 303
J3A JIS X 0213:2004 Level 3 addendum 1
J4 JIS X 0213:2004 Level 4 277
 Macau MA HKSCS-2008 9 38
MC MCSCS Reference 2
MD MCSCS horizontal extensions 27
 North Korea KP1 KPS 10721-2000 5,766 5,766
 South Korea K1 KS C 5657-91 (now KS X 1002:2001) 1 247
K4 PKS 5700-3:1998 166
K6 KS X 1027-5:2014 80
 Taiwan T3 CNS 11643-1992 plane 3 25 30,192
T4 CNS 11643-1992 plane 4 3,408
T5 CNS 11643-1992 plane 5 8,111
T6 CNS 11643-1992 plane 6 5,934
T7 CNS 11643-1992 plane 7 6,299
TA 化學命名原則(第四版) (Chemical Nomenclature: 4th Edition) 7
TB CNS 11643-1992 plane 11 6
TC CNS 11643-1992 plane 12 1
TF CNS 11643-1992 plane 15 6,401
 United Kingdom UK IRG N2107R2 13 13
 Vietnam V0 TCVN 5773-1993 1,570 5,296
V2 VHN 01-1998 2,287
V3 VHN 02-1998 422
V4 Dictionary on Nom (Từ điển chữ Nôm); Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày); Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam) 33
VN Vietnamese horizontal extensions 984
n/a SAT SAT Daizōkyō Text Database 1 80
UTC UTC sources 79

CJK Unified Ideographs Extension C

The block named CJK Unified Ideographs Extension C (2A700–2B73F) contains 4,153 characters in the range U+2A700 through U+2B738. It was initially added in Unicode 5.2 (2009).

Charts

2A700–2B73F.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (4,565) more than the number of encoded characters (4,153).[7]

Country or region Code Source[8] Character count Total
 China GBK Encyclopedia of China 74 1,130
GCH Ci Hai (辞海) 264
GCY Ci Yuan (辭源) 1
GCYY Chinese Academy of Surveying and Mapping ideographs 55
GDM Place name characters from the Public Order Administration, Ministry of Public Security, People's Republic of China 1
GFZ Founder Press System 1
GGFZ Tongyong Guifan Hanzi Zidian (通用规范汉字字典) 2
GGH Gudai Hanyu Cidian (古代汉语词典) 51
GHC Hanyu Dacidian (漢語大詞典) 14
GHZ Hanyu Dazidian Ideographs (漢語大字典) 1
GHZR 汉语大字典(第二版) 1
GJZ Commercial Press ideographs 61
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 6
GKX Kangxi Dictionary Ideographs (康熙字典) 6
GXC Xiandai Hanyu Cidian (现代汉语词典) 25
GZFY Hanyu Fangyan Dacidian (汉语方言大词典) 202
GZJW Collections of Bronze Inscriptions from Yin and Zhou Dynasties (殷周金文集成引得) 365
 Hong Kong H Hong Kong Supplementary Character Set, 2008 1 1
 Japan JK Japanese Kokuji Collection 367 367
 Macau MC MCSCS Reference 16 20
MD MCSCS horizontal extensions 4
 North Korea KP1 KPS 10721-2000 8 8
 South Korea K5 Korean IRG Hanja Character Set 404 405
K6 KS X 1027-5:2014 1
 Taiwan T5 CNS 11643-1992 plane 5 634 1,751
TC CNS 11643-1992 plane 12 634
TD CNS 11643-1992 plane 13 766
TE CNS 11643-1992 plane 14 350
 United Kingdom UK IRG N2107R2 1 1
 Vietnam V0 TCVN 5773-1993 4 794
V1 TCVN 6056:1995 1
V4 Dictionary on Nom (Từ điển chữ Nôm); Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày); Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam) 782
VN Vietnamese horizontal extensions 6
n/a UTC UTC sources 88 88

CJK Unified Ideographs Extension D

The block named CJK Unified Ideographs Extension D (2B740–2B81F) contains 222 characters in the range U+2B740 through U+2B81D that were added in Unicode 6.0 (2010).

Charts

2B740–2B81F.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (229) more than the number of encoded characters (222).[7]

Country or region Code Source[8] Character count Total
 China GCH Ci Hai (辞海) 1 78
GIDC ID System of the Ministry of Public Security of China 32
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 2
GXC Xiandai Hanyu Cidian (现代汉语词典) 4
GZH ZhongHua ZiHai (中华字海) 39
 Japan JH Hanyo-Denshi Program (汎用電子情報交換環境整備プログラム) 107 107
 Taiwan TB CNS 11643-1992 plane 11 24 24
n/a UTC UTC sources 20 20

CJK Unified Ideographs Extension E

The block named CJK Unified Ideographs Extension E (2B820–2CEAF) contains 5,762 characters in the range U+2B820 through U+2CEA1 that were added in Unicode 8.0 (2015).

Charts

2B820–2CEAF.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (5,819) more than the number of encoded characters (5,762).[7]

Country or region Code Source[8] Character count Total
 China GBK Encyclopedia of China 15 2,820
GCH Ci Hai (辞海) 112
GCY Ci Yuan (辭源) 3
GCYY Chinese Academy of Surveying and Mapping ideographs 98
GDZ Geology Press ideographs 1
GGFZ Tongyong Guifan Hanzi Zidian (通用规范汉字字典) 4
GGH Gudai Hanyu Cidian (古代汉语词典) 175
GHC Hanyu Dacidian (漢語大詞典) 7
GIDC ID System of the Ministry of Public Security of China 36
GJZ Commercial Press ideographs 147
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 2
GKX Kangxi Dictionary Ideographs (康熙字典) 22
GRM People's Daily ideographs 3
GWZ Hanyu Da Cidian Press ideographs 12
GXC Xiandai Hanyu Cidian (现代汉语词典) 57
GXH Xinhua Zidian (新华字典) 4
GZFY Hanyu Fangyan Dacidian (汉语方言大词典) 712
GZJW Collections of Bronze Inscriptions from Yin and Zhou Dynasties (殷周金文集成引得) 1,410
 Japan JK Japanese Kokuji Collection 415 415
 Macau MC MCSCS Reference 48 51
MD MCSCS horizontal extensions 3
 Taiwan T3 CNS 11643-1992 plane 3 2 1,260
TB CNS 11643-1992 plane 11 1
TC CNS 11643-1992 plane 12 323
TD CNS 11643-1992 plane 13 595
TE CNS 11643-1992 plane 14 339
 United Kingdom UK IRG N2107R2 2 2
 Vietnam V0 TCVN 5773-1993 6 1,035
V4 Dictionary on Nom (Từ điển chữ Nôm); Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày); Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam) 1,027
VN Vietnamese horizontal extensions 6
n/a UCI UTC sources 236 236

CJK Unified Ideographs Extension F

The block named CJK Unified Ideographs Extension F (2CEB0–2EBEF) contains 7,473 characters in the range U+2CEB0 through U+2EBE0 that were added in Unicode 10.0 (2017). It includes more than 1,000 Sawndip characters for Zhuang.

Charts

2CEB0–2EBEF.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (7,755) more than the number of encoded characters (7,473).[7]

Country or region Code Source[8] Character count Total
 China GCY Ci Yuan (辭源) 122 1,309
GFC Modern Chinese Standard Dictionary (现代汉语规范词典第二版) 27
GIDC ID System of the Ministry of Public Security of China 1
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 5
GLGYJ Zhuang Liao Songs Research (壮族嘹歌研究) 1
GOCD Oxford English-Chinese Chinese-English Dictionary (牛津英汉汉英词典) 2
GPGLG Zhuang Folk Song Culture Series - Pingguo County Liao Songs (壮族民歌文化丛书•平果嘹歌) 70
GXHZ Xinhua Da Zidian (新华大字典) 51
GZ Ancient Zhuang Character Dictionary (古壮字字典) 995
GZJW Collections of Bronze Inscriptions from Yin and Zhou Dynasties (殷周金文集成引得) 33
GZYS Chinese Ancient Ethnic Characters Research (中国民族古文字研究) 2
 Japan JMJ Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業) 1,645 1,645
 South Korea KC Korean History On-Line (한국 역사 정보 통합 시스템) 1,793 1,793
 Macau MC MCSCS Reference 22 22
 Taiwan T3 CNS 11643-1992 plane 3 1 3
T6 CNS 11643-1992 plane 6 1
TC CNS 11643-1992 plane 12 1
 United Kingdom UK IRG N2107R2 2 2
 Vietnam V0 TCVN 5773-1993 1 17
V4 Dictionary on Nom (Từ điển chữ Nôm); Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày); Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam) 8
VN Vietnamese horizontal extensions 8
n/a SAT SAT Daizōkyō Text Database 2,884 2,964
UTC UTC sources 80

CJK Unified Ideographs Extension G

A block named CJK Unified Ideographs Extension G was added as part of Unicode 13.0 to the Tertiary Ideographic Plane in the range U+30000 through U+3134F, containing 4,939 characters.[9]

Charts

30000–3134F.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (5,074) more than the number of encoded characters (4,939).[7]

Country or region Code Source[8] Character count Total
 China GHZR 汉语大字典(第二版) 878 2,082
GPGLG Zhuang Folk Song Culture Series - Pingguo County Liao Songs (壮族民歌文化丛书•平果嘹歌) 13
GZ Ancient Zhuang Character Dictionary (古壮字字典) 1,191
 South Korea KC Korean History On-Line (한국 역사 정보 통합 시스템) 428 428
 Taiwan T13 TCA-CNS 11643 19th plane (pending new version) 347 353
TB CNS 11643-1992 plane 11 3
TC CNS 11643-1992 plane 12 2
TD CNS 11643-1992 plane 13 1
 United Kingdom UK IRG N2107R2 1,566 1,566
 Vietnam V4 Dictionary on Nom (Từ điển chữ Nôm); Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày); Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam) 6 76
VN Vietnamese horizontal extensions 70
n/a SAT SAT Daizōkyō Text Database 329 569
UTC UTC sources 240

CJK Compatibility Ideographs

The block named CJK Compatibility Ideographs (F900–FAFF) was created to retain round-trip compatibility with other standards. Only twelve of its characters have the "Unified Ideograph" property: U+FA0E, FA0F, FA11, FA13, FA14, FA1F, FA21, FA23, FA24, FA27, FA28 and FA29.[1] None of the other characters in this and other "Compatibility" blocks relate to CJK Unification.

Charts

F900–FAFF.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (24) more than the number of encoded Unified characters (12).[7]

Country or region Code Source[8] Character count Total
 Japan J3 JIS X 0213:2004 Level 3 3 8
J4 JIS X 0213:2004 Level 4 3
JA Japanese IT Vendors Contemporary Ideographs, 1993 1
JA3 JIS X 0213:2004 level-3 characters replacing JA characters 1
 Taiwan TF CNS 11643-1992 plane 15 1 1
 Vietnam V0 TCVN 5773-1993 3 3
n/a UTC UTC sources 12 12

UTC Sources

The Ideographic Research Group (IRG) bears formal responsibility for developing extensions to the encoded repertoires of unified CJK ideographs. The Unicode Consortium participates in this group as a liaison member of ISO. Characters submitted by the Unicode Technical Committee bear the prefix "UTC". All CJK Unified Ideographs in ISO/IEC 10646 are required to have at least one source identifier. Changes to IRG source information, however, can leave a given ideograph without any such source. In such cases, the ideograph is included in the U-source database to guarantee that it has at least one source. Such ideographs are indicated by a source prefix of "UCI" instead of "UTC".[10]

The UTC sources consist of the following:

  • ABC Chinese-English Dictionary by John DeFrancis
  • The Adobe-CNS1 glyph collection
  • The Adobe-Japan1 glyph collection
  • A Complete Checklist of Species and Subspecies of Chinese Birds (中国鸟类系统检索)
  • The Great Nom Dictionary (Đại Tự Điển Chữ Nôm)
  • Annotations to Shuowen Jiezi (annotated by Duan Yucai)
  • GB18030-2000
  • Required Character List Supplied by The Church of Jesus Christ of Latter-day Saints (Hong Kong)
  • New Commercial Dictionary (商务新词典), Hong Kong
  • Defect reports filed against the Unicode Standard or other direct communication with the Unicode editorial committee
  • Unicode Technical Committee (UTC) documents
  • Modern Chinese Dictionary (现代汉语词典), by Chinese Academy of Social Sciences, Linguistics Research Institute, Dictionary Editorial Office
  • Working Group (WG2) documents
  • Wenlin (文林) http://www.wenlin.com/

Known issues

Disunification

U+4039

The character U+4039 (䀹) was a unification of two different characters (one with jiā 夾 phonetic and one with shǎn 㚒 phonetic) until Unicode 5.0. However, they were lexically different characters that should not have been unified; they have different pronunciations and different meanings.

The proposal to disunify U+4039[11] was accepted, and the new character was encoded at U+9FC3 (鿃) in Unicode 5.1.

Three other glyphs in Extension B

In CJK Unified Ideographs Extension B, some characters are incorrectly unified with others. These characters include U+2017B (