Encoding on the Internet
Although 256 characters are enough for most Western European languages, they cannot cover non-Roman scripts or languages that use Roman letters outside the Western European set. Therefore, other 8-bit encodings were developed for languages outside Western Europe.
To accommodate both English and the other script, many 8-bit encodings are structured as follows:
| Script | Encoding | #0-127 | #128-255 |
|---|---|---|---|
| Arabic | ISO-8859-6* | ASCII | Arabic |
| Greek | ISO-8859-7* | ASCII | Greek |
| Hebrew | ISO-8859-8* | ASCII | Hebrew |
| Russian/Cyrillic | ISO-8859-5* (rarely used) | ASCII | Russian |
*External links to charts by Matts Tande.
On the Internet, if you switch your browser's encoding (View » Character Set/Encoding) while viewing an English site, you will in most cases still see English, because the new encoding also supports ASCII.
Because non-Roman encodings include ASCII, if you switch to a font for one of these encodings in a word processor and begin to type, you will still see English characters. The non-Roman letters do not appear until you switch your keyboard.
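The shared ASCII range can be checked directly. The following is a minimal Python sketch, not part of the original tutorial; the codec names (`iso8859_6` and so on) are Python's spellings of the ISO-8859-x sets, and the sample byte `0xE1` is an arbitrary choice from the upper range.

```python
# Bytes 0-127 are ASCII in each of these encodings, so plain English
# text decodes to the same string in all of them.
ENCODINGS = ["iso8859_6", "iso8859_7", "iso8859_8", "iso8859_5"]

english = b"Hello, world"
decoded_english = {enc: english.decode(enc) for enc in ENCODINGS}

# A single byte from the 128-255 range maps to a different letter
# in each script (Arabic, Greek, Hebrew, Cyrillic).
high_byte = bytes([0xE1])
decoded_high = {enc: high_byte.decode(enc) for enc in ENCODINGS}

for enc in ENCODINGS:
    print(enc, repr(decoded_english[enc]), repr(decoded_high[enc]))
```

The English text comes out identical under every encoding, while the single high byte yields four different letters.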
For many scripts, there is a competing Windows encoding standard and a non-Windows standard, typically one registered at the ISO as an ISO-8859-x set. For instance, Hebrew Web pages can be encoded as either ISO-8859-8 ("Visual Hebrew") or Windows-1255.
| Script | ISO/Other | Windows Encoding |
|---|---|---|
| Arabic | ISO-8859-6 | Windows-1256 |
| Greek | ISO-8859-7 ("ELOT") | Windows-1253 |
| Hebrew | ISO-8859-8 ("Visual Hebrew") | Windows-1255 |
| Russian/Cyrillic | KOI-8 | Windows-1251 |
| Central Europe | ISO-8859-2 ("Latin 2") | Windows-1250 |
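To see why the choice between competing standards matters, here is a small Python sketch using the Russian row of the table. It assumes Python's `koi8_r` codec as the KOI-8 variant and `cp1251` for Windows-1251; the sample word is only an illustration.

```python
text = "Привет"  # sample Russian word, chosen for illustration

# The same text becomes different bytes under each standard.
as_windows = text.encode("cp1251")
as_koi8 = text.encode("koi8_r")
print(as_windows.hex(), as_koi8.hex())

# Reading Windows-1251 bytes with the wrong standard gives garbled text.
misread = as_windows.decode("koi8_r")
print(misread)

# Reading them with the matching standard recovers the original.
print(as_windows.decode("cp1251"))
```

This is the situation a browser is in when a page is labeled with one encoding but saved in the other.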
If you develop in FrontPage for Windows, your Web page (even an English one) will automatically be encoded in the Windows standard unless you specify otherwise, which is not always possible.
These links show the specifications for different encoding systems and the languages they are associated with. However, most languages can also be encoded as Unicode (UTF-8).
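As a rough illustration of the difference, the Python sketch below encodes a string mixing Greek, Hebrew, and Russian. The sample string is an assumption made for the example; UTF-8 handles all three scripts, while a single-script 8-bit encoding such as ISO-8859-7 cannot.

```python
mixed = "Ελληνικά עברית Русский"  # Greek, Hebrew, Russian

# UTF-8 can encode all three scripts and round-trips cleanly.
utf8_bytes = mixed.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# A Greek-only 8-bit encoding fails on the Hebrew and Cyrillic letters.
try:
    mixed.encode("iso8859_7")
    greek_only_ok = True
except UnicodeEncodeError:
    greek_only_ok = False

print(round_trip == mixed, greek_only_ok)
```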
NOTE: "C.P." (code page) is the same as "Windows". For instance, CP1252 is Windows-1252.
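Many programming environments treat the two names as aliases. As a quick check in Python (used here only as an example environment), the standard `codecs` module resolves both spellings to the same codec:

```python
import codecs

# Both names resolve to the same underlying codec.
cp_name = codecs.lookup("cp1252").name
windows_name = codecs.lookup("windows-1252").name
print(cp_name, windows_name)

# The same byte therefore decodes to the same character under either name.
sample = b"\x80"
print(sample.decode("cp1252") == sample.decode("windows-1252"))
```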
