Teaching and Learning with Technology
Computing With Accents and Foreign Scripts

Encoding on the Internet

3: Expanded 8-Bit Encoding


8-Bit Encoding

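To increase the number of characters encoded, vendors doubled the range of ASCII from 128 to 256 (2^8) characters. This became known as "8-bit encoding". In the usual structure, code points 0-127 remain the same as ASCII, and the added code points 128-255 hold accented letters, extra punctuation, and other symbols.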

Crucially, each combination of a letter plus a different accent forms a separate character or code point. For instance, á, â, à, Á, Â, and À are assigned six different numbers in 8-bit encoding.
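As a small illustration of this point, the following Python sketch looks up the number assigned to each of those six letters. It assumes Windows-1252 as the 8-bit encoding (Python's `cp1252` codec); the variable names are made up for this example.

```python
# Each accented letter is its own code point in an 8-bit encoding.
letters = ["á", "â", "à", "Á", "Â", "À"]

# encode() returns a bytes object. In an 8-bit encoding each character
# becomes exactly one byte, and indexing that byte gives its decimal number.
code_points = {ch: ch.encode("cp1252")[0] for ch in letters}

for ch, number in code_points.items():
    print(ch, number)
```

Each letter prints with a different number, which is why changing only the accent or the case changes the stored byte.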


Vendor Differences

Windows-1252 vs. MacRoman

Unfortunately, not all vendors used the same 8-bit encoding. The biggest difference was that older Windows computers used Windows-1252, while pre-OS X Macintosh computers used MacRoman. As a result, the same character is not always assigned to the same code point, and some characters are found in only one of the two encodings.
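One way to see that the two encodings do not cover the same characters is to compare what each assigns to the upper half of the range. The sketch below uses Python's built-in `cp1252` and `mac_roman` codecs as stand-ins for the two vendor encodings; the helper name `repertoire` is hypothetical, and the codec tables may differ in small details from what a given older system actually shipped.

```python
def repertoire(encoding):
    """Return the set of characters an 8-bit encoding assigns to bytes 128-255."""
    chars = set()
    for value in range(128, 256):
        try:
            chars.add(bytes([value]).decode(encoding))
        except UnicodeDecodeError:
            # Some byte values are unassigned in this codec; skip them.
            pass
    return chars

windows = repertoire("cp1252")
mac = repertoire("mac_roman")

shared = windows & mac        # characters available in both encodings
only_windows = windows - mac  # e.g. the fraction ½
only_mac = mac - windows      # e.g. the Greek letter π

print(len(shared), "shared,", len(only_windows), "Windows-only,", len(only_mac), "Mac-only")
```

Characters in `only_windows` or `only_mac` cannot be stored at all in the other encoding, which is a separate problem from shared characters sitting at different numbers.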

For instance, in the chart below, character #128 is € (euro) in Windows-1252, but Ä (A-umlaut) in MacRoman. Similarly, the ¥ (yen) character is #165 in Windows-1252, but #180 in MacRoman.
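The two examples above can be checked with a short Python sketch. It again relies on Python's `cp1252` and `mac_roman` codecs to approximate the two encodings, and the variable names are illustrative only.

```python
# The same byte value means different characters in the two encodings.
raw = bytes([128])
as_windows = raw.decode("cp1252")    # byte 128 read as Windows-1252
as_mac = raw.decode("mac_roman")     # byte 128 read as MacRoman
print(as_windows, as_mac)

# The same character is stored under different numbers.
yen_windows = "¥".encode("cp1252")[0]
yen_mac = "¥".encode("mac_roman")[0]
print(yen_windows, yen_mac)
```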

NOTE: Today both Windows and OS X use Unicode, but differences persist due to issues of compatibility with older documents and software. The older the software, the more likely compatibility problems will occur.
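To show what such a compatibility problem can look like, here is a minimal sketch of an older MacRoman document being opened as if it were Windows-1252. The sample word is invented for this example, and the codecs are Python's `mac_roman` and `cp1252`.

```python
original = "café"

# Bytes as an older Macintosh program might have saved them.
mac_bytes = original.encode("mac_roman")

# Reading those bytes with the wrong encoding garbles the accented letter.
misread = mac_bytes.decode("cp1252")
print(misread)

# If the wrong guess is known, reversing it recovers the text.
repaired = misread.encode("cp1252").decode("mac_roman")
print(repaired)
```

The repair step only works when every byte in the document is also defined in Windows-1252; otherwise the wrong decoding fails or loses information.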

Windows-1252 vs. MacRoman Chart

[Image: Chart of Windows-1252 vs. MacRoman]

NOTE: The chart above was generated with the Excel function char(#) in both Mac and Windows. Not all code points have been checked for accuracy.

Additional reference charts are available from Kosta Kostis.

NOTE: Some charts may list the decimal number (base-10) as well as the hexadecimal (base-16) number and octal (base-8) number. In most cases, you would refer to the decimal number.
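If a chart lists only the hexadecimal or octal number, converting it to decimal is straightforward. The sketch below uses the three code points mentioned earlier on this page; the `rows` name is just for illustration.

```python
# Show each decimal code point alongside its hexadecimal and octal forms.
rows = [(n, format(n, "02X"), format(n, "o")) for n in (128, 165, 180)]

for decimal, hexadecimal, octal in rows:
    print(decimal, hexadecimal, octal)

# Converting back: int() takes the base as its second argument.
from_hex = int("A5", 16)
from_octal = int("245", 8)
print(from_hex, from_octal)
```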


©Penn State University, 2000-2009.
This Web page maintained by Teaching and Learning with Technology, a unit of Information Technology Services. For questions or comments on this Web page, please contact Elizabeth J. Pyatt (ejp10@psu.edu).
Unicode character names and hexadecimal entity codes are taken from the public Unicode Character Charts.

Last Modified: Wednesday, 01-Aug-2007 13:32:48 EDT