Encoding on the Internet
Although 256 characters can support most Western European languages, they are not enough to handle non-Roman scripts or languages that use Roman characters outside the Western European set. Therefore, other 8-bit encodings were developed for languages outside Western Europe.
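As a rough illustration of that limit, the sketch below checks whether a string fits in a single 8-bit Western European encoding. It assumes ISO-8859-1 (Latin-1) as the Western European set, which the text above does not name, and the helper `fits_in_latin1` is a hypothetical name chosen for this example.

```python
def fits_in_latin1(text: str) -> bool:
    """Return True if every character of text exists in ISO-8859-1.

    ISO-8859-1 is assumed here as a typical 8-bit Western European set.
    """
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        # At least one character has no slot among the 256 code points.
        return False


if __name__ == "__main__":
    samples = ["café", "καφές", "Łódź"]  # French, Greek, Polish
    for sample in samples:
        print(sample, fits_in_latin1(sample))
```

The French sample fits, while the Greek and Polish samples need characters that this 256-character set does not contain.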
To accommodate both English and the other script, many 8-bit encodings are structured as follows:
| Script | Encoding | #0-127 | #128-255 |
|---|---|---|---|
| Arabic | ISO-8859-6* (rarely used) | ASCII | Arabic |
| Greek | ISO-8859-7* | ASCII | Greek |
| Hebrew | ISO-8859-8* | ASCII | Hebrew |
*External links to Wikipedia
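The split shown in the table can be observed by decoding one byte from each half of the range. This is a minimal sketch using Python's built-in codecs for the three ISO-8859 sets; the helper `decode_byte` and the choice of the sample bytes 0x41 and 0xE1 are assumptions made for illustration.

```python
import unicodedata

# Python codec names for the encodings listed in the table.
ENCODINGS = {
    "Arabic": "iso8859_6",
    "Greek": "iso8859_7",
    "Hebrew": "iso8859_8",
}


def decode_byte(value: int, encoding: str) -> str:
    """Decode a single byte value (0-255) under the given encoding."""
    return bytes([value]).decode(encoding, errors="replace")


if __name__ == "__main__":
    for script, encoding in ENCODINGS.items():
        low = decode_byte(0x41, encoding)   # lower half: shared ASCII
        high = decode_byte(0xE1, encoding)  # upper half: script-specific
        print(script, repr(low), unicodedata.name(high))
```

Byte 0x41 decodes to the same Roman letter in all three encodings, while byte 0xE1 decodes to a different letter in each script.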
On the Internet, if you switch your browser's encoding (View » Character Set/Encoding) while viewing an English site, in most cases you will still see English because the new encoding also supports it.
Because non-Roman encodings include ASCII, if you switch to a properly encoded font in a word processor and begin to type, you will see English characters. It is not until you switch your keyboard that the non-Roman letters appear.
For many scripts, there is a competing Windows encoding standard and a non-Windows standard, typically one registered with the ISO as an ISO-8859-x set. For instance, Hebrew Web pages can be encoded as either ISO-8859-8 ("Visual Hebrew") or Windows-1255.
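The reason English survives an encoding switch can be sketched by decoding the same ASCII bytes under several encodings and comparing the results. The function name `decodes_identically` and the particular list of codecs are assumptions for this example, not something the page prescribes.

```python
# ASCII plus a few non-Roman encodings, using Python's codec names.
ENCODINGS = [
    "ascii",
    "iso8859_6", "cp1256",  # Arabic
    "iso8859_7", "cp1253",  # Greek
    "iso8859_8", "cp1255",  # Hebrew
]


def decodes_identically(data: bytes, encodings: list[str]) -> bool:
    """Return True if data decodes to the same text under every encoding."""
    results = {data.decode(enc, errors="replace") for enc in encodings}
    return len(results) == 1


if __name__ == "__main__":
    # Pure ASCII bytes: every encoding agrees.
    print(decodes_identically(b"Hello, world!", ENCODINGS))
    # A byte from the upper half: Greek and Hebrew encodings disagree.
    print(decodes_identically(bytes([0xE1]), ["iso8859_7", "iso8859_8"]))
```

Only bytes in the 128-255 range change meaning when the encoding changes, which is why the page looks the same until non-Roman text is involved.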
| Script | ISO/Other | Windows Encoding |
|---|---|---|
| Arabic | ISO-8859-6 | Windows-1256 |
| Greek | ISO-8859-7 ("ELOT") | Windows-1253 |
| Hebrew | ISO-8859-8 ("Visual Hebrew") | Windows-1255 |
| Russian/Cyrillic | KOI-8 | Windows-1251 |
| Central Europe | ISO-8859-2 ("Latin 2") | Windows-1250 |
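A short sketch of why the choice between the two columns matters: text written in one encoding and read in the other can come out as the wrong letters. The codec names below are Python's; mapping "KOI-8" to the `koi8_r` codec is an assumption, since the table does not say which KOI-8 variant is meant, and `misread` is a hypothetical helper.

```python
# (ISO/other codec, Windows codec) for each row of the table.
PAIRS = {
    "Arabic": ("iso8859_6", "cp1256"),
    "Greek": ("iso8859_7", "cp1253"),
    "Hebrew": ("iso8859_8", "cp1255"),
    "Russian/Cyrillic": ("koi8_r", "cp1251"),  # assumes KOI8-R for "KOI-8"
    "Central Europe": ("iso8859_2", "cp1250"),
}


def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one encoding, then decode the bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")


if __name__ == "__main__":
    word = "Привет"  # Russian for "hello"
    other, windows = PAIRS["Russian/Cyrillic"]
    print(misread(word, other, other))    # matching encodings: text intact
    print(misread(word, other, windows))  # mismatched: scrambled letters
```

When the declared encoding matches the one used to write the page, the text round-trips; when it does not, the bytes are still valid but map to different characters.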
If you develop in FrontPage for Windows, your Web page (even an English one) will be automatically encoded in the Windows standard unless you specify otherwise (and sometimes you cannot).
©Penn State University, 2000-2011.
This Web page maintained by Teaching and Learning with Technology, a unit of Information Technology Services. For questions or comments on this Web page, please contact Elizabeth J. Pyatt (ejp10@psu.edu).
Unicode character names and hexadecimal entity codes are taken from the public Unicode Character Charts.
Last Modified: Friday, 12-Aug-2011 17:54:25 EDT

