Internationalization (i18n)

wpadmin Uncategorized October 8, 2021

Internationalization is abbreviated "i18n" (the 18 stands for the eighteen letters between the "i" and the "n"). A website or service that serves visitors from all over the world needs to be internationalized, because character sets, sort orders, calendars, and date and time formats all vary by region. In Japan, for example, timestamps combine Chinese characters (kanji) with digits: "2021年12月23日12時34分56秒" stands for 12:34:56 on 2021/12/23 (hh:mm:ss, yyyy/MM/dd).

Time format

Language	Expression
Chinese	二〇二一年十二月二十三日星期四上午两点三十四分五十六秒
Japanese	2021年12月23日木曜日午前2時34分56秒
Korean	2021년 12월 23일 목요일 오전 2시 34분 56초
English	Thursday, December 23, 2021 2:34:56 am
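
The patterns in the table can be sketched in a few lines of Python. This is only a hand-written illustration for two of the languages: the helper names `format_ja` and `format_en` are made up for this post, and a real application would normally rely on a localization library instead of hard-coded patterns.

```python
from datetime import datetime

# Weekday names indexed by datetime.weekday() (Monday == 0).
JA_WEEKDAYS = "月火水木金土日"
EN_WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday"]
EN_MONTHS = ["January", "February", "March", "April", "May", "June",
             "July", "August", "September", "October", "November", "December"]


def format_ja(dt: datetime) -> str:
    """Japanese long format: year, month, day, weekday, am/pm, time."""
    period = "午前" if dt.hour < 12 else "午後"
    hour12 = dt.hour % 12  # simplified: noon and midnight are shown as 0時
    return (f"{dt.year}年{dt.month}月{dt.day}日"
            f"{JA_WEEKDAYS[dt.weekday()]}曜日"
            f"{period}{hour12}時{dt.minute}分{dt.second}秒")


def format_en(dt: datetime) -> str:
    """English (US style) long format with a 12-hour clock."""
    period = "am" if dt.hour < 12 else "pm"
    hour12 = dt.hour % 12 or 12  # 0 o'clock is written as 12
    return (f"{EN_WEEKDAYS[dt.weekday()]}, {EN_MONTHS[dt.month - 1]} "
            f"{dt.day}, {dt.year} {hour12}:{dt.minute:02d}:{dt.second:02d} {period}")


moment = datetime(2021, 12, 23, 2, 34, 56)
print(format_ja(moment))
print(format_en(moment))
```

The same `datetime` value produces a differently ordered and differently labelled string for each language, which is why the format cannot be hard-coded once for every visitor.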

Character Code

A single-byte character set (SBCS) represents each character with one byte. Such sets are used mainly for English and other European languages. Because one byte can take only 256 different values, a single-byte character set cannot cover languages with thousands of characters, such as Chinese and Japanese.
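
As a rough illustration of that limit, the Python sketch below tries to encode text with a single-byte codec and reports whether it succeeded. The helper name `fits_single_byte` is made up for this example; the codec names are the ones Python's standard library accepts.

```python
def fits_single_byte(text: str, encoding: str = "iso-8859-1") -> bool:
    """Return True if every character of `text` exists in the given codec.

    Assumes `encoding` names a single-byte codec such as "ascii" or "iso-8859-1".
    """
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


print(fits_single_byte("Hello"))    # English text fits
print(fits_single_byte("日本語"))    # Japanese text does not fit
```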

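Bytes per character	Possible values	Character sets (what they cover at this length)
1	256	ASCII (English), ISO-8859 (European etc.), SJIS (half-width katakana), UTF-8 (ASCII range)
2	65,536	UTF-16 (most characters in common use), UTF-8 (accented Latin, Greek, Cyrillic etc.), SJIS (kanji, hiragana etc.), GB2312 (Chinese)
3	16,777,216	UTF-8 (most remaining characters, including Chinese, Japanese, and Korean)
4	4,294,967,296	UTF-32 (all languages), UTF-16 (supplementary characters, as surrogate pairs), UTF-8 (supplementary characters)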
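
To see these lengths in practice, the following Python sketch measures how many bytes one character takes in several encodings. The helper `byte_length` is a made-up name for this example, and the sample characters are arbitrary choices.

```python
def byte_length(ch: str, encoding: str) -> int:
    """Number of bytes the string `ch` occupies in `encoding`."""
    return len(ch.encode(encoding))


# "utf-16-le" and "utf-32-le" are used so that no byte order mark is counted.
ENCODINGS = ["utf-8", "utf-16-le", "utf-32-le"]

for ch in ["A", "é", "あ", "😀"]:
    sizes = ", ".join(f"{enc}: {byte_length(ch, enc)}" for enc in ENCODINGS)
    print(f"{ch}  {sizes}")

# Legacy national encodings mix 1-byte and 2-byte characters as well.
print("ｱ", byte_length("ｱ", "shift_jis"))   # half-width katakana
print("あ", byte_length("あ", "shift_jis"))  # hiragana
print("中", byte_length("中", "gb2312"))     # Chinese character
```

UTF-8 grows from one to four bytes as the characters move away from the ASCII range, while UTF-32 always uses four bytes.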

ASCII

ASCII is one of the most common single-byte codes. It includes the letters (a-z, A-Z), the digits (0-9), control and whitespace characters (such as line breaks and the space), and symbols (!, ", #, _, etc.).
ASCII does not use all 256 values of an 8-bit byte; it uses only 7 bits, which gives 128 characters.
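
A quick way to check the 7-bit rule is to test whether every byte is below 0x80 (128). The function name `is_7bit_ascii` below is an assumption made for this sketch, not part of any library.

```python
def is_7bit_ascii(data: bytes) -> bool:
    """Return True if every byte fits in 7 bits (values 0 to 127)."""
    return all(byte < 0x80 for byte in data)


print(is_7bit_ascii(b"Hello, world!\n"))
print(is_7bit_ascii("é".encode("iso-8859-1")))  # 0xE9 needs the 8th bit
```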

Dec	Hex	ASCII	Dec	Hex	ASCII	Dec	Hex	ASCII
10	0A	[LF]	46	2E	.	62	3E	>
13	0D	[CR]	47	2F	/	63	3F	?
32	20	[Space]	48	30	0	64	40	@
33	21	!	49	31	1	65	41	A
34	22	"	50	32	2	66	42	B
35	23	#	51	33	3	67	43	C
36	24	$	52	34	4	68	44	D
37	25	%	53	35	5	69	45	E
38	26	&	54	36	6	70	46	F
39	27	'	55	37	7	71	47	G
40	28	(	56	38	8	72	48	H
41	29	)	57	39	9	73	49	I
42	2A	*	58	3A	:	74	4A	J
43	2B	+	59	3B	;	75	4B	K
44	2C	,	60	3C	<	76	4C	L
45	2D	-	61	3D	=	77	4D	M

Dec	Hex	ASCII	Dec	Hex	ASCII	Dec	Hex	ASCII
78	4E	N	94	5E	^	110	6E	n
79	4F	O	95	5F	_	111	6F	o
80	50	P	96	60	`	112	70	p
81	51	Q	97	61	a	113	71	q
82	52	R	98	62	b	114	72	r
83	53	S	99	63	c	115	73	s
84	54	T	100	64	d	116	74	t
85	55	U	101	65	e	117	75	u
86	56	V	102	66	f	118	76	v
87	57	W	103	67	g	119	77	w
88	58	X	104	68	h	120	78	x
89	59	Y	105	69	i	121	79	y
90	5A	Z	106	6A	j	122	7A	z
91	5B	[	107	6B	k	123	7B	{
92	5C	\	108	6C	l	124	7C	|
93	5D	]	109	6D	m	125	7D	}
						126	7E	~
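
Rows like the ones in the table can be generated rather than typed by hand. The sketch below is one possible way to do it in Python; `ascii_row` and the `NAMES` mapping are hypothetical names, and only the three non-printing characters listed in the table are given labels.

```python
# Labels for the non-printing characters that appear in the table.
NAMES = {10: "[LF]", 13: "[CR]", 32: "[Space]"}


def ascii_row(code: int) -> str:
    """Return 'decimal<TAB>hex<TAB>character' for one ASCII code."""
    if not 0 <= code < 128:
        raise ValueError("ASCII codes are 0 to 127")
    return f"{code}\t{code:02X}\t{NAMES.get(code, chr(code))}"


for code in (10, 32, 65, 97, 126):
    print(ascii_row(code))
```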

ISO-8859-2

ISO 8859 is a family of single-byte, 8-bit character encodings. It has 15 parts, such as ISO-8859-1, ISO-8859-2, and so on. ISO-8859-2 supports the Central and Eastern European languages that use the Latin alphabet, including Bosnian, Polish, Croatian, Czech, Slovak, Slovene, Serbian, and Hungarian.

Dec	Hex	8859-2	Dec	Hex	8859-2	Dec	Hex	8859-2
10	0A	[LF]	46	2E	.	62	3E	>
13	0D	[CR]	47	2F	/	63	3F	?
32	20	[Space]	48	30	0	64	40	@
33	21	!	49	31	1	65	41	A
34	22	"	50	32	2	66	42	B
35	23	#	51	33	3	67	43	C
36	24	$	52	34	4	68	44	D
37	25	%	53	35	5	69	45	E
38	26	&	54	36	6	70	46	F
39	27	'	55	37	7	71	47	G
40	28	(	56	38	8	72	48	H
41	29	)	57	39	9	73	49	I
42	2A	*	58	3A	:	74	4A	J
43	2B	+	59	3B	;	75	4B	K
44	2C	,	60	3C	<	76	4C	L
45	2D	-	61	3D	=	77	4D	M

Dec	Hex	8859-2	Dec	Hex	8859-2	Dec	Hex	8859-2
78	4E	N	94	5E	^	110	6E	n
79	4F	O	95	5F	_	111	6F	o
80	50	P	96	60	`	112	70	p
81	51	Q	97	61	a	113	71	q
82	52	R	98	62	b	114	72	r
83	53	S	99	63	c	115	73	s
84	54	T	100	64	d	116	74	t
85	55	U	101	65	e	117	75	u
86	56	V	102	66	f	118	76	v
87	57	W	103	67	g	119	77	w
88	58	X	104	68	h	120	78	x
89	59	Y	105	69	i	121	79	y
90	5A	Z	106	6A	j	122	7A	z
91	5B	[	107	6B	k	123	7B	{
92	5C	\	108	6C	l	124	7C	|
93	5D	]	109	6D	m	125	7D	}

Dec	Hex	8859-2	Dec	Hex	8859-2	Dec	Hex	8859-2
160	A0	[NBSP]	176	B0	°	192	C0	Ŕ
161	A1	Ą	177	B1	ą	193	C1	Á
162	A2	˘	178	B2	˛	194	C2	Â
163	A3	Ł	179	B3	ł	195	C3	Ă
164	A4	¤	180	B4	´	196	C4	Ä
165	A5	Ľ	181	B5	ľ	197	C5	Ĺ
166	A6	Ś	182	B6	ś	198	C6	Ć
167	A7	§	183	B7	ˇ	199	C7	Ç
168	A8	¨	184	B8	¸	200	C8	Č
169	A9	Š	185	B9	š	201	C9	É
170	AA	Ş	186	BA	ş	202	CA	Ę
171	AB	Ť	187	BB	ť	203	CB	Ë
172	AC	Ź	188	BC	ź	204	CC	Ě
173	AD	[SHY]	189	BD	˝	205	CD	Í
174	AE	Ž	190	BE	ž	206	CE	Î
175	AF	Ż	191	BF	ż	207	CF	Ď

Dec	Hex	8859-2	Dec	Hex	8859-2	Dec	Hex	8859-2
208	D0	Đ	224	E0	ŕ	240	F0	đ
209	D1	Ń	225	E1	á	241	F1	ń
210	D2	Ň	226	E2	â	242	F2	ň
211	D3	Ó	227	E3	ă	243	F3	ó
212	D4	Ô	228	E4	ä	244	F4	ô
213	D5	Ő	229	E5	ĺ	245	F5	ő
214	D6	Ö	230	E6	ć	246	F6	ö
215	D7	×	231	E7	ç	247	F7	÷
216	D8	Ř	232	E8	č	248	F8	ř
217	D9	Ů	233	E9	é	249	F9	ů
218	DA	Ú	234	EA	ę	250	FA	ú
219	DB	Ű	235	EB	ë	251	FB	ű
220	DC	Ü	236	EC	ě	252	FC	ü
221	DD	Ý	237	ED	í	253	FD	ý
222	DE	Ţ	238	EE	î	254	FE	ţ
223	DF	ß	239	EF	ď	255	FF	˙
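
The upper half of the table (160 to 255) is where ISO-8859-2 differs from other parts of ISO 8859, so the same byte can mean a different character depending on which part is assumed. A minimal Python sketch, with `decode_byte` as a made-up helper name:

```python
def decode_byte(value: int, encoding: str) -> str:
    """Decode a single byte value (0 to 255) with the given codec."""
    return bytes([value]).decode(encoding)


# Compare how ISO-8859-2 and ISO-8859-1 interpret the same byte.
for value in (0x41, 0xA3, 0xB1, 0xD8):
    latin2 = decode_byte(value, "iso-8859-2")
    latin1 = decode_byte(value, "iso-8859-1")
    print(f"{value:02X}  ISO-8859-2: {latin2}  ISO-8859-1: {latin1}")

# A Polish word fits in one byte per character under ISO-8859-2.
print("Łódź".encode("iso-8859-2").hex(" "))
```

Bytes below 128 decode to the same ASCII character in both codecs, while a byte such as A3 gives Ł in ISO-8859-2 and £ in ISO-8859-1. This is why text must always be stored together with the name of its encoding.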
