Scripts in this region are often called Brahmic scripts because they descend from the Brahmi (Brāhmī) script. This set of scripts can be further subdivided into Northern and Southern Brahmic scripts.
Southern Brahmic scripts are partly distinguished by the different options for vowel placement and have traditionally not been as well supported technologically as some of the dominant Northern Brahmic scripts.
In addition, some languages are written in right-to-left scripts such as Arabic (particularly in Pakistan) or Thaana (for Divehi).
Because vowel signs in these scripts require complex placement, Unicode fonts are not always interchangeable between platforms. OpenType (OTF) fonts work in Windows, but not perfectly in OS X; Apple fonts rely on ATSUI technology instead. It is better to use the South Asian fonts supplied by Microsoft and Apple whenever possible.
| Windows Support | Macintosh Support | Linux/Unix |
|---|---|---|
| Windows XP Supports; Windows XP Service Pack Two Adds; Windows Vista Adds | Macintosh Supports; System 10.4 (Tiger) Adds; System 10.7 (Lion) Adds | Freeware Utilities are Available for X11 Unix Environment |
Encoding: utf-8 (Unicode), ISCII (older), ITRANS (older)
Use Unicode to develop new pages.
One option is to use Dreamweaver, Microsoft Expression or other Web editor and change the keyboard to the correct script. This will allow you to type content in directly with the appropriate script. However, it is important to verify that the correct encoding is specified in the Web page header.
Another option is to compose the basic text in an international or foreign language text editor or word processor and export the content as an HTML or text file with the appropriate encoding. This file could be opened in another HTML editor such as Dreamweaver or Microsoft Expression, and edited for formatting.
For Web tools such as Blogs at Penn State, Facebook, Twitter, del.icio.us, Flickr, and others, users can typically change the keyboard and input text. In most cases, this content will be encoded as Unicode.
For short texts, such as the yoga om sign (ॐ = &#x0950;), it may be desirable to enter numeric HTML entity codes based on the Unicode code point of each character.
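As a minimal sketch of this approach, the fragment below writes the om sign with numeric entity codes instead of typing the character directly. U+0950 is the Unicode code point for ॐ, so the hexadecimal form and the decimal form (0x950 = 2384) refer to the same character. The paragraph markup and label text are illustrative only.

```html
<!-- Both entity codes resolve to U+0950 DEVANAGARI OM -->
<p>Hexadecimal entity code: &#x0950;</p>
<p>Decimal entity code: &#2384;</p>
```

A browser should show the same glyph for both lines, provided a font that covers Devanagari is installed.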
Before the development of Unicode encoding, the government of India had developed a standard called ISCII (Indian Script Code for Information Interchange). In this standard, similar characters in multiple scripts were assigned the same character number. For instance, Devanagari क (ka) and Gujarati ક (ka) were assigned the same code point. However, most modern development is in Unicode.
Computers process text by assuming a certain encoding or a system of matching electronic data with visual text characters. Whenever you develop a Web site you need to make sure the proper encoding is specified in the header tags; otherwise the browser may default to U.S. settings and not display the text properly.
To declare an encoding, insert the following meta tag near the top of your HTML file, inside the head section (or inspect the tag already there), then replace "???" with one of the encoding codes listed above. If you are not sure, use utf-8 as the encoding.
Generic Encoding Template
<head>
<meta http-equiv="Content-Type" content="text/html; charset=???">
...
</head>
Declare Unicode
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
...
</head>
The final closing slash must be included after the final quotation mark in the encoding meta tag if you are using XHTML.
Declare Unicode in XHTML
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
...
</head>
If no encoding is declared, then the browser uses the default setting, which in the U.S. is typically Latin-1. In that case many Unicode characters could be displayed incorrectly. Also, older browsers such as Netscape 4.7 may not be able to process the entity codes correctly without the "utf-8" declaration.
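For pages written as HTML5, a shorter form of the declaration is also accepted. The skeleton below is only a sketch under that assumption, not a template from this page; the title and body text are placeholders. Very old browsers may still expect the longer http-equiv form.

```html
<!DOCTYPE html>
<html>
<head>
<!-- Put the declaration early in the head, before any non-ASCII text such as the title -->
<meta charset="utf-8">
<title>नमस्ते</title>
</head>
<body>
<!-- Placeholder content: a Hindi greeting in Devanagari -->
<p>नमस्ते</p>
</body>
</html>
```

The file itself must still be saved as UTF-8; the declaration only tells the browser how to read the bytes.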
Language tags are also suggested so that search engines and screen readers can identify the language of a page. These are metadata tags which indicate the language of a page, not devices to trigger translation. Visit the Language Tag page to view information on where to insert them.
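As a rough illustration of language tagging, the sketch below marks a page as Hindi (code hi) and one passage as Urdu (code ur), which is written right to left. The page content is a placeholder, and the Language Tag page remains the reference for which codes to use.

```html
<!-- lang on the root element declares the main language of the page (hi = Hindi) -->
<html lang="hi">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>नमस्ते</title>
</head>
<body>
<p>नमस्ते</p>
<!-- A passage in another language carries its own lang; dir="rtl" marks right-to-left text (ur = Urdu) -->
<p lang="ur" dir="rtl">السلام علیکم</p>
</body>
</html>
```

Tagging the Urdu paragraph separately lets a screen reader switch pronunciation rules for that passage only.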
In some cases, your best options may be to use PDF files or image files. See the Web Development Tips section for more details.
These pages cover internationalization of South Asian scripts in general.
©Penn State University, 2000-2011.
This Web page maintained by Teaching and Learning with Technology, a unit of Information Technology Services. For questions or comments on this Web page, please contact Elizabeth J. Pyatt (ejp10@psu.edu).
Unicode character names and hexadecimal entity codes are taken from the public Unicode Character Charts.
Last Modified: Friday, 11-May-2012 18:33:11 EDT

