To display an HTML page correctly, the browser must know what character set to use.
The early World Wide Web used the ASCII character set. ASCII supports the digits 0-9, the uppercase and lowercase English alphabet, and some special characters.
Because many countries use characters that are not part of ASCII, the default character set in modern browsers is ISO-8859-1.
If a page uses a character set other than ISO-8859-1, that character set should be specified in the <meta> tag.
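As an illustration of how a program might read that declaration, here is a minimal Python sketch built on the standard library's html.parser module. The sample page, the MetaCharsetFinder class, and the declared_charset helper are hypothetical names made up for this example, and real browsers follow a more involved detection procedure than this.

```python
from html.parser import HTMLParser

# Hypothetical sample page; the charset value is only an illustration.
SAMPLE_PAGE = """<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-2">
<title>Example</title>
</head>
<body><p>Hello</p></body>
</html>"""


class MetaCharsetFinder(HTMLParser):
    """Records the first character set declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        # Short form: <meta charset="...">
        if attributes.get("charset"):
            self.charset = attributes["charset"].strip()
            return
        # Long form: <meta http-equiv="Content-Type" content="...; charset=...">
        content = attributes.get("content") or ""
        for part in content.split(";"):
            name, _, value = part.strip().partition("=")
            if name.lower() == "charset" and value:
                self.charset = value.strip()


def declared_charset(html_text, default="ISO-8859-1"):
    """Return the charset named in a <meta> tag, or the default if none is found."""
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    return finder.charset or default


print(declared_charset(SAMPLE_PAGE))
```

The fallback to ISO-8859-1 mirrors the default described above; it is an assumption of this sketch, not a rule taken from any specification.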
The ISO character sets are the standard character sets defined by the International Organization for Standardization (ISO) for different alphabets and languages.
The following table lists the different character sets used throughout the world:
|ISO-8859-1||Latin alphabet part 1||North America, Western Europe, Latin America, the Caribbean, Canada, Africa|
|ISO-8859-2||Latin alphabet part 2||Eastern Europe|
|ISO-8859-3||Latin alphabet part 3||SE Europe, Esperanto, miscellaneous|
|ISO-8859-4||Latin alphabet part 4||Scandinavia and the Baltic states (and other regions not covered by ISO-8859-1)|
|ISO-8859-5||Latin / Cyrillic part 5||Languages that use the Cyrillic alphabet, such as Bulgarian, Belarusian, Russian, and Macedonian|
|ISO-8859-6||Latin / Arabic part 6||Languages that use the Arabic alphabet|
|ISO-8859-7||Latin / Greek part 7||Modern Greek, as well as mathematical symbols derived from Greek|
|ISO-8859-8||Latin / Hebrew part 8||Hebrew language|
|ISO-8859-9||Latin 5 part 9||Turkish|
|ISO-8859-10||Latin 6||Sami (Lappish), Nordic, and Inuit languages|
|ISO-8859-15||Latin 9 (aka Latin 0)||Similar to ISO-8859-1, but the euro sign and several other characters replace some of the less frequently used symbols|
|ISO-2022-JP||Latin / Japanese part 1||Japanese|
|ISO-2022-JP-2||Latin / Japanese part 2||Japanese|
|ISO-2022-KR||Latin / Korean part 1||Korean|
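To see why the choice among the character sets in the table matters, the following Python sketch decodes the same bytes under several of them. The decode_with helper is a hypothetical name introduced here; the codec names are the aliases that Python's standard codecs accept for these ISO character sets.

```python
def decode_with(charsets, data):
    """Decode the same bytes under each named character set."""
    return {name: data.decode(name) for name in charsets}


# One byte, three different characters depending on the character set.
letters = decode_with(["iso-8859-1", "iso-8859-5", "iso-8859-7"], bytes([0xE4]))
for name, char in letters.items():
    print(f"{name}: U+{ord(char):04X} {char}")

# ISO-8859-15 replaces the currency sign at byte 0xA4 with the euro sign.
symbols = decode_with(["iso-8859-1", "iso-8859-15"], bytes([0xA4]))
for name, char in symbols.items():
    print(f"{name}: U+{ord(char):04X} {char}")
```

Byte 0xE4 comes out as a Latin, a Cyrillic, or a Greek letter depending on the character set. A page read with the wrong character set shows the wrong text for this reason.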
Because the character sets listed above are limited in size and are not compatible in multilingual environments, the Unicode Consortium developed the Unicode Standard.
The Unicode Standard covers nearly all the characters, punctuation marks, and symbols in the world.
Unicode enables the processing, storage, and interchange of text data regardless of platform, program, or language.
The goal of the Unicode Consortium is to replace the existing character sets with its standard Unicode Transformation Format (UTF).
The Unicode Consortium cooperates with leading standards development organizations such as ISO, W3C, and ECMA.
Unicode can be implemented by different character encodings. The most commonly used encodings are UTF-8 and UTF-16.
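As a rough illustration of how UTF-8 and UTF-16 differ, the Python sketch below counts the bytes each encoding needs for a few sample characters. The encoded_sizes helper and the sample characters are assumptions made for this example. The "utf-16-le" codec is used so that the byte order mark Python prepends for plain "utf-16" does not inflate the count.

```python
def encoded_sizes(text):
    """Return the number of bytes needed to store text in UTF-8 and UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # Little-endian UTF-16 without a byte order mark.
        "utf-16": len(text.encode("utf-16-le")),
    }


# Sample characters: Latin letter A, e with acute accent, euro sign, an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
for char in samples:
    sizes = encoded_sizes(char)
    print(f"U+{ord(char):04X}: UTF-8 {sizes['utf-8']} bytes, UTF-16 {sizes['utf-16']} bytes")
```

UTF-8 uses one to four bytes per character and stores ASCII text in a single byte each, while UTF-16 uses two or four bytes per character.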
Tip: The first 256 characters of Unicode correspond to the 256 characters of ISO-8859-1.
Tip: All HTML 4 processors support UTF-8, and all XHTML and XML processors support UTF-8 and UTF-16.
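The first tip can be checked with a short Python sketch. The latin1_matches_unicode function is a hypothetical helper written for this example; it relies on Python's built-in "iso-8859-1" codec.

```python
def latin1_matches_unicode():
    """Check that byte value n in ISO-8859-1 decodes to Unicode code point n."""
    all_bytes = bytes(range(256))
    decoded = all_bytes.decode("iso-8859-1")
    return [ord(char) for char in decoded] == list(range(256))


print(latin1_matches_unicode())

# The code points match, but the stored bytes differ once UTF-8 is used:
e_acute = "\u00e9"
print(e_acute.encode("iso-8859-1"))  # a single byte, 0xE9
print(e_acute.encode("utf-8"))       # two bytes, 0xC3 0xA9
```

The correspondence is between character numbers (code points), not between stored bytes. A file saved as ISO-8859-1 is therefore valid UTF-8 only when it contains nothing beyond ASCII.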