Encoding on the Internet
Although 256 characters can support most Western European languages, they are not enough to handle non-Roman scripts or languages with non-standard Roman characters. Therefore, other 8-bit encodings were developed for languages outside Western Europe.
To accommodate both English and the other script, many 8-bit encodings keep ASCII in positions 0-127 and place the other script in positions 128-255, as follows:
| Script | Encoding | #0-127 | #128-255 |
|---|---|---|---|
| Arabic | ISO-8859-6* | ASCII | Arabic |
| Greek | ISO-8859-7* | ASCII | Greek |
| Hebrew | ISO-8859-8* | ASCII | Hebrew |
| Russian/Cyrillic | ISO-8859-5* (rarely used) | ASCII | Russian |
*External links to charts by Matts Tande.
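The two-halves layout can be checked with a short Python sketch. It assumes Python's built-in codec names (`iso8859_6`, `iso8859_7`, and so on) correspond to the encodings listed; the helper `split_halves` and the sample strings are made up for illustration.

```python
# Python codec names assumed to correspond to the encodings in the table.
ENCODINGS = {
    "Arabic": "iso8859_6",
    "Greek": "iso8859_7",
    "Hebrew": "iso8859_8",
    "Russian/Cyrillic": "iso8859_5",
}


def split_halves(text, encoding):
    """Encode text and separate bytes in 0-127 from bytes in 128-255."""
    data = text.encode(encoding)
    low = bytes(b for b in data if b < 128)
    high = bytes(b for b in data if b >= 128)
    return low, high


# Mixed English and Greek text: the Latin letters land in the ASCII half,
# and the single Greek letter lands in the upper half.
low, high = split_halves("alpha = \u03b1", "iso8859_7")
print(low, high)

# Plain English encodes to the same ASCII bytes in every encoding above.
for script, enc in ENCODINGS.items():
    print(script, "Hello".encode(enc))
```

Each of these encodings uses one byte per character, so the Greek letter contributes exactly one byte above 127.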
On the Internet, if you switch the encoding in your browser (View » Character Set/Encoding) while viewing an English site, you will in most cases still see English, because the new encoding also includes ASCII.
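What the browser does when you switch encodings is reinterpret the same bytes. The sketch below imitates that in Python, assuming the codec names shown match the browser's menu entries; `latin_1` (ISO-8859-1) stands in for a Western European encoding, and both byte strings are invented samples.

```python
# Bytes of an English page: every value is below 128.
english_page = b"Welcome to my home page"

# Bytes of a hypothetical Western European page: 0xE9 is above 127.
french_word = b"caf\xe9"

# Reinterpret the same bytes under several encodings.
for enc in ("latin_1", "iso8859_5", "iso8859_6", "iso8859_7", "iso8859_8"):
    english = english_page.decode(enc)
    mixed = french_word.decode(enc, errors="replace")
    print(f"{enc}: {english!r} / {mixed!r}")
```

The English text comes out identical every time. Only the last character of the second sample changes, because it is the only byte in the 128-255 range.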
Because non-Roman encodings include ASCII, if you switch to a properly encoded font in a word processor and begin to type, you will see English characters. It is not until you switch your keyboard that the non-Roman letters appear.
For many scripts, there is a competing Windows encoding standard and a non-Windows standard, typically one registered at the ISO as an ISO-8859-x set. For instance, Hebrew Web pages can be encoded as either ISO-8859-8 ("Visual Hebrew") or Windows-1255.
| Script | ISO/Other | Windows Encoding |
|---|---|---|
| Arabic | ISO-8859-6 | Windows-1256 |
| Greek | ISO-8859-7 ("ELOT") | Windows-1253 |
| Hebrew | ISO-8859-8 ("Visual Hebrew") | Windows-1255 |
| Russian/Cyrillic | KOI-8 | Windows-1251 |
| Central Europe | ISO-8859-2 ("Latin 2") | Windows-1250 |
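To see why the choice between competing standards matters, the following Python sketch encodes one Russian word both ways. It assumes Python's `koi8_r` codec is the "KOI-8" of the table and `cp1251` is Windows-1251; the sample word is arbitrary.

```python
text = "\u041f\u0440\u0438\u0432\u0435\u0442"  # Russian for "Hello", a sample string

koi8_bytes = text.encode("koi8_r")  # KOI8-R, assumed to be the table's "KOI-8"
win_bytes = text.encode("cp1251")   # Windows-1251

# The same six letters produce different byte values in each encoding.
print(koi8_bytes.hex())
print(win_bytes.hex())

# Decoding with the matching encoding recovers the text.
print(koi8_bytes.decode("koi8_r"))

# Decoding with the competing encoding yields the wrong letters.
print(koi8_bytes.decode("cp1251", errors="replace"))
```

A page saved in one standard but read in the other therefore shows Cyrillic letters, just not the ones the author typed.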
If you develop in FrontPage for Windows, your Web page (even an English one) will be automatically encoded in the Windows standard unless you specify otherwise (and sometimes you cannot).
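Outside of any particular editor, specifying the encoding comes down to two things that must agree: the bytes the file is saved in and the charset the page declares. The Python sketch below illustrates this under the assumption that the page declares its charset with a `meta http-equiv` tag; `build_page` is a hypothetical helper, not something FrontPage provides.

```python
def build_page(body_text, charset):
    """Return page bytes whose declared charset matches the actual encoding."""
    html = (
        "<html><head>"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">'
        "</head><body>" + body_text + "</body></html>"
    )
    # Encode with the same charset the meta tag declares.
    return html.encode(charset)


hebrew = "\u05e9\u05dc\u05d5\u05dd"  # sample Hebrew word

# The same page saved under the Windows standard and the ISO standard.
windows_page = build_page(hebrew, "windows-1255")
iso_page = build_page(hebrew, "iso-8859-8")

print(windows_page)
print(iso_page)
```

The markup is pure ASCII, so it is readable in both versions; only the declared charset name and the bytes for the Hebrew word depend on the encoding chosen.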
