CJK Unified Ideographs

CJKV ideograph 次 in traditional and simplified Chinese, Korean, Vietnamese and Japanese

The Chinese, Japanese and Korean (CJK) scripts share a common background, and the characters they have in common are collectively known as CJK characters. In the process called Han unification, these shared characters were identified and named CJK Unified Ideographs. As of Unicode 14.0, Unicode defines a total of 92,865 CJK Unified Ideographs.^[1]
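The total can be cross-checked against the per-block figures quoted in the block sections of this article. The following is a minimal Python sketch of that arithmetic; the dictionary name BLOCK_COUNTS and its layout are illustrative assumptions, while the counts themselves are the ones stated in this article.

```python
# Encoded CJK Unified Ideographs per block, as quoted in this article
# (Unicode 14.0). The dictionary name and layout are illustrative.
BLOCK_COUNTS = {
    "CJK Unified Ideographs": 20_992,
    "Extension A": 6_592,
    "Extension B": 42_720,
    "Extension C": 4_153,
    "Extension D": 222,
    "Extension E": 5_762,
    "Extension F": 7_473,
    "Extension G": 4_939,
    # Only the twelve characters with the "Unified Ideograph" property.
    "CJK Compatibility Ideographs": 12,
}

total = sum(BLOCK_COUNTS.values())
print(f"{total:,}")  # 92,865
```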

The terms ideographs or ideograms may be misleading, since the Chinese script is not strictly a pictographic or ideographic system.

Historically, Vietnam used Chinese ideographs too, so sometimes the abbreviation CJKV is used. This system was replaced by the Latin-based Vietnamese alphabet in the 1920s.

CJK Unified Ideographs blocks

CJK Unified Ideographs

The basic block named CJK Unified Ideographs (4E00–9FFF) contains 20,992 basic Chinese characters in the range U+4E00 through U+9FFF. The block not only includes characters used in the Chinese writing system but also kanji used in the Japanese writing system and hanja, whose use is diminishing in Korea. Many characters in this block are used in all three writing systems, while others are in only one or two of the three. Chữ Hán are also used in Vietnam's chữ Nôm (now obsolete). The first 20,902 characters in the block are arranged according to the Kangxi Dictionary ordering of radicals. In this system the characters written with the fewest strokes are listed first. The remaining characters were added later, and so are not in radical order.
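Because this block is fully assigned, its size follows directly from its end points, and block membership reduces to a range check. A minimal Python sketch, in which the constant and function names are illustrative assumptions:

```python
# End points of the basic CJK Unified Ideographs block, as stated above.
URO_FIRST = 0x4E00
URO_LAST = 0x9FFF


def in_basic_cjk_block(ch: str) -> bool:
    """Return True if the single character ch lies in U+4E00..U+9FFF."""
    return URO_FIRST <= ord(ch) <= URO_LAST


# Inclusive range, hence the +1.
block_size = URO_LAST - URO_FIRST + 1
print(block_size)                 # 20992
print(in_basic_cjk_block("次"))   # True: 次 is U+6B21
print(in_basic_cjk_block("A"))    # False
```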

The block is the result of Han unification,^[2] which was somewhat controversial within East Asia.^[3] Since Chinese, Japanese and Korean characters were coded in the same location, the appearance of a selected glyph could depend on the particular font being used. However, the source separation rule states that characters encoded separately in an earlier character set would remain separate in the new Unicode encoding.^[4]

Using variation selectors, it is possible to specify certain variant CJK ideograms within Unicode. The Adobe-Japan1 character set, which has 14,683 ideographic variation sequences,^[5] is an extreme example of the use of variation selectors.^[6]
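An ideographic variation sequence is a base ideograph followed by one variation selector from the Variation Selectors Supplement block (U+E0100 through U+E01EF). The Python sketch below builds one such sequence and checks its shape. The function name is hypothetical, the base check is deliberately limited to the basic block for brevity, and the pairing of U+845B with U+E0100 is used only as an illustration; the Ideographic Variation Database is the place to confirm whether any particular sequence is registered.

```python
# Variation Selectors Supplement (VS17..VS256), used by ideographic
# variation sequences.
VS_SUPPLEMENT = range(0xE0100, 0xE01F0)


def looks_like_ivs(text: str) -> bool:
    """Shape check only: one basic-block ideograph plus one selector.

    This does not consult the Ideographic Variation Database, so it cannot
    tell whether the sequence is actually registered.
    """
    if len(text) != 2:
        return False
    base, selector = text
    return 0x4E00 <= ord(base) <= 0x9FFF and ord(selector) in VS_SUPPLEMENT


# Illustrative sequence: U+845B followed by VS17 (U+E0100).
sequence = "\u845b\U000E0100"
print([f"U+{ord(c):04X}" for c in sequence])  # ['U+845B', 'U+E0100']
print(looks_like_ivs(sequence))               # True
print(looks_like_ivs("\u845b"))               # False: no selector
```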

Charts

4E00-62FF, 6300-77FF, 7800-8CFF, 8D00-9FFF.

Sources

Note: Most characters appear in multiple sources, making the sum of individual character counts (102,698) far more than the number of encoded characters (20,992).^[7]

Country or region	Code	Source^[8]	Character count	Total
China	G0	GB 2312-80	6,763	20,841
	G1	GB 12345-90	2,202
	G3	GB 7589-87 traditional form	4,834
	G5	GB 7590-87 traditional form	2,841
	G7	Modern Chinese general character chart (Simplified Chinese: 现代汉语通用字表)	42
	G8	GB 8565-88	199
	GCE	National Academy for Educational Research	4
	GDM	Place name characters from the Public Order Administration, Ministry of Public Security, People's Republic of China	2
	GE	GB16500-95	3,772
	GFC	Modern Chinese Standard Dictionary (现代汉语规范词典第二版)	2
	GGFZ	Tongyong Guifan Hanzi Zidian (通用规范汉字字典)	1
	GH	GB/T 15564-1995	59
	GHZ	Hanyu Dazidian Ideographs (漢語大字典)	1
	GHZR	汉语大字典（第二版）	1
	GK	GB 12052-89	89
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	16
	GKX	Kangxi Dictionary Ideographs (康熙字典)	3
	GLK	龍龕手鑑	1
	GT	Standard Telegraph Codebook (revised), 1983	8
	GZFY	Hanyu Fangyan Dacidian (汉语方言大词典)	1
Hong Kong	H	Hong Kong Supplementary Character Set, 2008	2,292	15,376
	HB0	Computer Chinese Glyph and Character Code Mapping Table, Technical Report C-26 (電腦用中文字型與字碼對照表, 技術通報C-26)	9
	HB1	Big-5, Level 1	5,401
	HB2	Big-5, Level 2	7,650
	HD	Hong Kong Supplementary Character Set, 2016	24
Japan	J0	JIS X 0208-1990	6,356	12,565
	J1	JIS X 0212-1990	3,058
	J13	JIS X 0213:2004 level-3 characters replacing J1 characters	1,037
	J13A	JIS X 0213:2004 level-3 character addendum from JIS X 0213:2000 level-3 replacing J1 character	2
	J14	JIS X 0213:2004 level-4 characters replacing J1 characters	1,704
	J3	JIS X 0213:2004 Level 3	95
	J3A	JIS X 0213:2004 Level 3 addendum	7
	J4	JIS X 0213:2004 Level 4	301
	JARIB	ARIB STD-B24	3
	JMJ	Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業)	2
Macau	MA	HKSCS-2008	29	200
	MB1	Big Five	10
	MB2	Big Five	7
	MC	MCSCS Reference	3
	MD	MCSCS horizontal extensions	127
	MDH	MCSCS horizontal extensions	24
North Korea	KP0	KPS 9566-97	4,652	15,011
North Korea	KP1	KPS 10721-2000	10,359	15,011
South Korea	K0	KS C 5601-87 (now KS X 1001:2004)	4,620	15,440
	K1	KS C 5657-91 (now KS X 1002:2001)	2,855
	K2	PKS C 5700-1:1994	7,911
	K3	PKS C 5700-2:1994	1
	K4	PKS 5700-3:1998	4
	K6	KS X 1027-5:2014	49
Taiwan	T1	CNS 11643-1992 plane 1	5,413	18,383
	T2	CNS 11643-1992 plane 2	7,650
	T3	CNS 11643-1992 plane 3	4,144
	T4	CNS 11643-1992 plane 4	894
	T5	CNS 11643-1992 plane 5	64
	T6	CNS 11643-1992 plane 6	31
	T7	CNS 11643-1992 plane 7	16
	TB	CNS 11643-1992 plane 11	2
	TC	CNS 11643-1992 plane 12	2
	TE	CNS 11643-1992 plane 14	9
	TF	CNS 11643-1992 plane 15	158
Vietnam	V0	TCVN 5773-1993	599	4,806
	V1	TCVN 6056:1995	3,307
	V2	VHN 01-1998	759
	V3	VHN 02-1998	91
	V4	Dictionary on Nom (Từ điển chữ Nôm) Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày) Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam)	19
	VN	Vietnamese horizontal extensions	31
n/a	UTC	UTC sources	76	76

In Unicode 4.1, 14 HKSCS-2004 characters and 8 GB 18030 characters were assigned to the code points U+9FA6 through U+9FBB. Since then, further characters have been added to this block for various reasons, all summarized in the version history section below.

CJK Unified Ideographs Extension A

The block named CJK Unified Ideographs Extension A (3400–4DBF) contains 6,592 additional characters in the range U+3400 through U+4DBF.

Charts

3400-4DBF.

Sources

Note: Most characters appear in more than one source, making the sum of individual character counts (18,828) far more than the number of encoded characters (6,592).^[7]

Country or region	Code	Source^[8]	Character count	Total
China	G3	GB 7589-87 traditional form	2,391	6,197
	G5	GB 7590-87 traditional form	1,226
	G7	Modern Chinese general character chart	120
	GGFZ	Tongyong Guifan Hanzi Zidian (通用规范汉字字典)	2
	GHZ	Hanyu Dazidian Ideographs (漢語大字典)	340
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	3
	GKX	Kangxi Dictionary Ideographs (康熙字典)	1,889
	GS	Singapore Chinese characters	226
Hong Kong	H	Hong Kong Supplementary Character Set, 2008	572	572
Japan	J3	JIS X 0213:2004 Level 3	2	738
	J4	JIS X 0213:2004 Level 4	78
	JA	Japanese IT Vendors Contemporary Ideographs, 1993	574
	JA3	JIS X 0213:2004 level-3 characters replacing JA characters	17
	JA4	JIS X 0213:2004 level-4 characters replacing JA characters	67
Macau	MA	HKSCS-2008	4	12
Macau	MD	MCSCS horizontal extensions	8	12
North Korea	KP0	KPS 9566-97	1	3,189
North Korea	KP1	KPS 10721-2000	3,188	3,189
South Korea	K3	PKS C 5700-2:1994	1,833	1,863
	K4	PKS 5700-3:1998	2
	K6	KS X 1027-5:2014	28
Taiwan	T3	CNS 11643-1992 plane 3	2,179	5,916
	T4	CNS 11643-1992 plane 4	2,919
	T5	CNS 11643-1992 plane 5	399
	T6	CNS 11643-1992 plane 6	200
	T7	CNS 11643-1992 plane 7	133
	TE	CNS 11643-1992 plane 14	1
	TF	CNS 11643-1992 plane 15	85
United Kingdom	UK	IRG N2107R2	2	2
Vietnam	V0	TCVN 5773-1993	140	319
	V2	VHN 01-1998	149
	V3	VHN 02-1998	19
	V4	Dictionary on Nom (Từ điển chữ Nôm) Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày) Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam)	5
	VN	Vietnamese horizontal extensions	6
n/a	UTC	UTC sources	20	20

CJK Unified Ideographs Extension B

The block named CJK Unified Ideographs Extension B (20000–2A6DF) contains 42,720 characters in the range U+20000 through U+2A6DF. These include most of the characters used in the Kangxi Dictionary that are not in the basic CJK Unified Ideographs block, as well as many Hán-Nôm characters that were formerly used to write Vietnamese.

Charts

20000-215FF, 21600-230FF, 23100-245FF, 24600-260FF, 26100-275FF, 27600-290FF, 29100-2A6DF.

Sources

Note: Many characters appear in more than one source, making the sum of individual character counts (74,136) far more than the number of encoded characters (42,720).^[7]

Country or region	Code	Source^[8]	Character count	Total
China	G3	GB 7589-87 traditional form	1	30,498
	G4K	Siku Quanshu	477
	GBK	Encyclopedia of China	86
	GCH	Ci Hai (辞海)	247
	GCY	Ci Yuan (辭源)	66
	GFZ	Founder Press System	65
	GGFZ	Tongyong Guifan Hanzi Zidian (通用规范汉字字典)	5
	GHC	Hanyu Dacidian (漢語大詞典)	553
	GHF	漢文佛典疑難俗字彙釋與研究	1
	GHZ	Hanyu Dazidian Ideographs (漢語大字典)	10,508
	GHZR	汉语大字典（第二版）	1
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	17
	GKX	Kangxi Dictionary Ideographs (康熙字典)	18,471
Hong Kong	H	Hong Kong Supplementary Character Set, 2008	1,703	1,703
Japan	J3	JIS X 0213:2004 Level 3	25	303
	J3A	JIS X 0213:2004 Level 3 addendum	1
	J4	JIS X 0213:2004 Level 4	277
Macau	MA	HKSCS-2008	9	38
	MC	MCSCS Reference	2
	MD	MCSCS horizontal extensions	27
North Korea	KP1	KPS 10721-2000	5,766	5,766
South Korea	K1	KS C 5657-91 (now KS X 1002:2001)	1	247
	K4	PKS 5700-3:1998	166
	K6	KS X 1027-5:2014	80
Taiwan	T3	CNS 11643-1992 plane 3	25	30,192
	T4	CNS 11643-1992 plane 4	3,408
	T5	CNS 11643-1992 plane 5	8,111
	T6	CNS 11643-1992 plane 6	5,934
	T7	CNS 11643-1992 plane 7	6,299
	TA	化學命名原則(第四版) (Chemical Nomenclature: 4th Edition)	7
	TB	CNS 11643-1992 plane 11	6
	TC	CNS 11643-1992 plane 12	1
	TF	CNS 11643-1992 plane 15	6,401
United Kingdom	UK	IRG N2107R2	13	13
Vietnam	V0	TCVN 5773-1993	1,570	5,296
	V2	VHN 01-1998	2,287
	V3	VHN 02-1998	422
	V4	Dictionary on Nom (Từ điển chữ Nôm) Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày) Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam)	33
	VN	Vietnamese horizontal extensions	984
n/a	SAT	SAT Daizōkyō Text Database	1	80
n/a	UTC	UTC sources	79	80

CJK Unified Ideographs Extension C

The block named CJK Unified Ideographs Extension C (2A700–2B73F) contains 4,153 characters in the range U+2A700 through U+2B738. It was initially added in Unicode 5.2 (2009).
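The block range and the range of assigned characters differ here: the block ends at U+2B73F, but the last character sits at U+2B738. A short Python sketch of the arithmetic, using a hypothetical helper name, shows how the character count and the number of unassigned trailing code points follow from those three end points:

```python
def assigned_and_reserved(block_first: int, block_last: int,
                          last_assigned: int):
    """Return (assigned, reserved) for a block filled contiguously from its start.

    Assumes every code point from block_first to last_assigned is assigned.
    """
    assigned = last_assigned - block_first + 1
    reserved = block_last - last_assigned
    return assigned, reserved


# Extension C: block 2A700..2B73F, characters through U+2B738.
print(assigned_and_reserved(0x2A700, 0x2B73F, 0x2B738))  # (4153, 7)
```

The same helper applies to the later extensions, whose blocks likewise end a few code points after their last character.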

Charts

2A700-2B73F.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (4,565) more than the number of encoded characters (4,153).^[7]

Country or region	Code	Source^[8]	Character count	Total
China	GBK	Encyclopedia of China	74	1,130
	GCH	Ci Hai (辞海)	264
	GCY	Ci Yuan (辭源)	1
	GCYY	Chinese Academy of Surveying and Mapping ideographs	55
	GDM	Place name characters from the Public Order Administration, Ministry of Public Security, People's Republic of China	1
	GFZ	Founder Press System	1
	GGFZ	Tongyong Guifan Hanzi Zidian (通用规范汉字字典)	2
	GGH	Gudai Hanyu Cidian (古代汉语词典)	51
	GHC	Hanyu Dacidian (漢語大詞典)	14
	GHZ	Hanyu Dazidian Ideographs (漢語大字典)	1
	GHZR	汉语大字典（第二版）	1
	GJZ	Commercial Press ideographs	61
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	6
	GKX	Kangxi Dictionary Ideographs (康熙字典)	6
	GXC	Xiandai Hanyu Cidian (现代汉语词典)	25
	GZFY	Hanyu Fangyan Dacidian (汉语方言大词典)	202
	GZJW	Collections of Bronze Inscriptions from Yin and Zhou Dynasties (殷周金文集成引得)	365
Hong Kong	H	Hong Kong Supplementary Character Set, 2008	1	1
Japan	JK	Japanese Kokuji Collection	367	367
Macau	MC	MCSCS Reference	16	20
Macau	MD	MCSCS horizontal extensions	4	20
North Korea	KP1	KPS 10721-2000	8	8
South Korea	K5	Korean IRG Hanja Character Set	404	405
South Korea	K6	KS X 1027-5:2014	1	405
Taiwan	T5	CNS 11643-1992 plane 5	634	1,751
	TC	CNS 11643-1992 plane 12	634
	TD	CNS 11643-1992 plane 13	766
	TE	CNS 11643-1992 plane 14	350
United Kingdom	UK	IRG N2107R2	1	1
Vietnam	V0	TCVN 5773-1993	4	794
	V1	TCVN 6056:1995	1
	V4	Dictionary on Nom (Từ điển chữ Nôm) Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày) Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam)	782
	VN	Vietnamese horizontal extensions	6
n/a	UTC	UTC sources	88	88

CJK Unified Ideographs Extension D

The block named CJK Unified Ideographs Extension D (2B740–2B81F) contains 222 characters in the range U+2B740 through U+2B81D that were added in Unicode 6.0 (2010).

Charts

2B740–2B81F.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (229) more than the number of encoded characters (222).^[7]

Country or region	Code	Source^[8]	Character count	Total
China	GCH	Ci Hai (辞海)	1	78
	GIDC	ID System of the Ministry of Public Security of China	32
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	2
	GXC	Xiandai Hanyu Cidian (现代汉语词典)	4
	GZH	ZhongHua ZiHai (中华字海)	39
Japan	JH	Hanyo-Denshi Program (汎用電子情報交換環境整備プログラム)	107	107
Taiwan	TB	CNS 11643-1992 plane 11	24	24
n/a	UTC	UTC sources	20	20

CJK Unified Ideographs Extension E

The block named CJK Unified Ideographs Extension E (2B820–2CEAF) contains 5,762 characters in the range U+2B820 through U+2CEA1 that were added in Unicode 8.0 (2015).

Charts

2B820–2CEAF.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (5,819) more than the number of encoded characters (5,762).^[7]

Country or region	Code	Source^[8]	Character count	Total
China	GBK	Encyclopedia of China	15	2,820
	GCH	Ci Hai (辞海)	112
	GCY	Ci Yuan (辭源)	3
	GCYY	Chinese Academy of Surveying and Mapping ideographs	98
	GDZ	Geology Press ideographs	1
	GGFZ	Tongyong Guifan Hanzi Zidian (通用规范汉字字典)	4
	GGH	Gudai Hanyu Cidian (古代汉语词典)	175
	GHC	Hanyu Dacidian (漢語大詞典)	7
	GIDC	ID System of the Ministry of Public Security of China	36
	GJZ	Commercial Press ideographs	147
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	2
	GKX	Kangxi Dictionary Ideographs (康熙字典)	22
	GRM	People's Daily ideographs	3
	GWZ	Hanyu Da Cidian Press ideographs	12
	GXC	Xiandai Hanyu Cidian (现代汉语词典)	57
	GXH	Xinhua Zidian (新华字典)	4
	GZFY	Hanyu Fangyan Dacidian (汉语方言大词典)	712
	GZJW	Collections of Bronze Inscriptions from Yin and Zhou Dynasties (殷周金文集成引得)	1,410
Japan	JK	Japanese Kokuji Collection	415	415
Macau	MC	MCSCS Reference	48	51
Macau	MD	MCSCS horizontal extensions	3	51
Taiwan	T3	CNS 11643-1992 plane 3	2	1,260
	TB	CNS 11643-1992 plane 11	1
	TC	CNS 11643-1992 plane 12	323
	TD	CNS 11643-1992 plane 13	595
	TE	CNS 11643-1992 plane 14	339
United Kingdom	UK	IRG N2107R2	2	2
Vietnam	V0	TCVN 5773-1993	6	1,035
	V4	Dictionary on Nom (Từ điển chữ Nôm) Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày) Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam)	1,027
	VN	Vietnamese horizontal extensions	6
n/a	UCI	UTC sources	236	236

CJK Unified Ideographs Extension F

The block named CJK Unified Ideographs Extension F (2CEB0–2EBEF) contains 7,473 characters in the range U+2CEB0 through U+2EBE0 that were added in Unicode 10.0 (2017). It includes more than 1,000 Sawndip characters for Zhuang.

Charts

2CEB0–2EBEF.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (7,755) more than the number of encoded characters (7,473).^[7]

Country or region	Code	Source^[8]	Character count	Total
China	GCY	Ci Yuan (辭源)	122	1,309
	GFC	Modern Chinese Standard Dictionary (现代汉语规范词典第二版)	27
	GIDC	ID System of the Ministry of Public Security of China	1
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	5
	GLGYJ	Zhuang Liao Songs Research (壮族嘹歌研究)	1
	GOCD	Oxford English-Chinese Chinese-English Dictionary (牛津英汉汉英词典)	2
	GPGLG	Zhuang Folk Song Culture Series - Pingguo County Liao Songs (壮族民歌文化丛书•平果嘹歌)	70
	GXHZ	Xinhua Da Zidian (新华大字典)	51
	GZ	Ancient Zhuang Character Dictionary (古壮字字典)	995
	GZJW	Collections of Bronze Inscriptions from Yin and Zhou Dynasties (殷周金文集成引得)	33
	GZYS	Chinese Ancient Ethnic Characters Research (中国民族古文字研究)	2
Japan	JMJ	Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業)	1,645	1,645
South Korea	KC	Korean History On-Line (한국 역사 정보 통합 시스템)	1,793	1,793
Macau	MC	MCSCS Reference	22	22
Taiwan	T3	CNS 11643-1992 plane 3	1	3
	T6	CNS 11643-1992 plane 6	1
	TC	CNS 11643-1992 plane 12	1
United Kingdom	UK	IRG N2107R2	2	2
Vietnam	V0	TCVN 5773-1993	1	17
	V4	Dictionary on Nom (Từ điển chữ Nôm) Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày) Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam)	8
	VN	Vietnamese horizontal extensions	8
n/a	SAT	SAT Daizōkyō Text Database	2,884	2,964
n/a	UTC	UTC sources	80	2,964

CJK Unified Ideographs Extension G

The block named CJK Unified Ideographs Extension G (30000–3134F) contains 4,939 characters in the range U+30000 through U+3134A. It was added to the Tertiary Ideographic Plane in Unicode 13.0 (2020).^[9]

Charts

30000–3134F.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (5,074) more than the number of encoded characters (4,939).^[7]

Country or region	Code	Source^[8]	Character count	Total
China	GHZR	汉语大字典（第二版）	878	2,082
	GPGLG	Zhuang Folk Song Culture Series - Pingguo County Liao Songs (壮族民歌文化丛书•平果嘹歌)	13
	GZ	Ancient Zhuang Character Dictionary (古壮字字典)	1,191
South Korea	KC	Korean History On-Line (한국 역사 정보 통합 시스템)	428	428
Taiwan	T13	TCA-CNS 11643 19th plane (pending new version)	347	353
	TB	CNS 11643-1992 plane 11	3
	TC	CNS 11643-1992 plane 12	2
	TD	CNS 11643-1992 plane 13	1
United Kingdom	UK	IRG N2107R2	1,566	1,566
Vietnam	V4	Dictionary on Nom (Từ điển chữ Nôm) Dictionary on Nom of Tay ethnic (Từ điển chữ Nôm Tày) Lookup Table for Nom in the South (Bảng tra chữ Nôm miền Nam)	6	76
Vietnam	VN	Vietnamese horizontal extensions	70	76
n/a	SAT	SAT Daizōkyō Text Database	329	569
n/a	UTC	UTC sources	240	569

CJK Compatibility Ideographs

The block named CJK Compatibility Ideographs (F900–FAFF) was created to retain round-trip compatibility with other standards. Only twelve of its characters have the "Unified Ideograph" property: U+FA0E, FA0F, FA11, FA13, FA14, FA1F, FA21, FA23, FA24, FA27, FA28 and FA29.^[1] None of the other characters in this and other "Compatibility" blocks relate to CJK Unification.
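Taken together, the block ranges in this article and the twelve exceptions above give a rough test for whether a code point is a CJK Unified Ideograph. The Python sketch below is an approximation under stated assumptions: it uses whole-block ranges as of Unicode 14.0, so the few unassigned code points at the end of Extensions C through G also match, and all names are illustrative. Production code would more safely read the Unified_Ideograph property from the Unicode Character Database.

```python
# Whole-block ranges (inclusive) as listed in this article, Unicode 14.0.
UNIFIED_BLOCKS = [
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0x3400, 0x4DBF),    # Extension A
    (0x20000, 0x2A6DF),  # Extension B
    (0x2A700, 0x2B73F),  # Extension C
    (0x2B740, 0x2B81F),  # Extension D
    (0x2B820, 0x2CEAF),  # Extension E
    (0x2CEB0, 0x2EBEF),  # Extension F
    (0x30000, 0x3134F),  # Extension G
]

# The twelve unified ideographs inside CJK Compatibility Ideographs.
UNIFIED_IN_COMPAT = {
    0xFA0E, 0xFA0F, 0xFA11, 0xFA13, 0xFA14, 0xFA1F,
    0xFA21, 0xFA23, 0xFA24, 0xFA27, 0xFA28, 0xFA29,
}


def is_cjk_unified(cp: int) -> bool:
    """Approximate Unified_Ideograph test based on block ranges."""
    if cp in UNIFIED_IN_COMPAT:
        return True
    return any(first <= cp <= last for first, last in UNIFIED_BLOCKS)


print(is_cjk_unified(0x6B21))  # True: basic block
print(is_cjk_unified(0xFA0E))  # True: one of the twelve
print(is_cjk_unified(0xF900))  # False: compatibility ideograph
```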

Charts

F900–FAFF.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (24) more than the number of encoded Unified characters (12).^[7]

Country or region	Code	Source^[8]	Character count	Total
Japan	J3	JIS X 0213:2004 Level 3	3	8
	J4	JIS X 0213:2004 Level 4	3
	JA	Japanese IT Vendors Contemporary Ideographs, 1993	1
	JA3	JIS X 0213:2004 level-3 characters replacing JA characters	1
Taiwan	TF	CNS 11643-1992 plane 15	1	1
Vietnam	V0	TCVN 5773-1993	3	3
n/a	UTC	UTC sources	12	12

UTC Sources

The Ideographic Research Group (IRG) is formally responsible for developing extensions to the encoded repertoires of unified CJK ideographs. The Unicode Consortium participates in this group as a liaison member of ISO. The characters submitted by the Unicode Technical Committee bear the prefix "UTC". All CJK Unified Ideographs in ISO/IEC 10646 are required to have at least one source identifier. Changes to IRG source information, however, can leave a given ideograph without any such source. In such cases, the ideograph is included in the U-source database to guarantee that it has at least one source. Such ideographs are indicated by a source prefix of "UCI" instead of "UTC".^[10]

The UTC sources consist of the following:

ABC Chinese-English Dictionary by John DeFrancis
The Adobe-CNS1 glyph collection
The Adobe-Japan1 glyph collection
A Complete Checklist of Species and Subspecies of Chinese Birds (中国鸟类系统检索)
The Great Nom Dictionary (Đại Tự Điển Chữ Nôm)
Annotations to Shuowen Jiezi (annotated by Duan Yucai)
GB 18030-2000
Required Character List Supplied by The Church of Jesus Christ of Latter-day Saints (Hong Kong)
New Commercial Dictionary (商务新词典), Hong Kong
Defect reports filed against the Unicode Standard or other direct communication with the Unicode editorial committee
Unicode Technical Committee (UTC) documents
Modern Chinese Dictionary (现代汉语词典), by Chinese Academy of Social Sciences, Linguistics Research Institute, Dictionary Editorial Office
Working Group (WG2) documents
Wenlin (文林) http://www.wenlin.com/

Known issues

Disunification

U+4039

The character U+4039 (䀹) was a unification of two different characters (one with jiā 夾 phonetic and one with shǎn 㚒 phonetic) until Unicode 5.0. However, they were lexically different characters that should not have been unified; they have different pronunciations and different meanings.

The proposal to disunify U+4039^[11] was accepted, and the new character was encoded at U+9FC3 (鿃) in Unicode 5.1.

Other 3 glyphs in Extension B

In CJK Unified Ideographs Extension B, some characters are incorrectly unified with others. These characters include U+2017B (

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]