This chapter reviews the concepts of internationalization (I18N). For more detailed information, we recommend Chapter 11 of the OSF/Motif Programmer's Guide, Release 1.2. For additional documentation on the underlying X and Xt support for localized applications, refer to Chapters 10 and 11 of the Xlib Programming Manual from O'Reilly & Associates, Inc.
Note: This chapter assumes that you are using Motif 1.2 or later. Earlier versions do not support locales.
Internationalizing Your Application
Internationalization (I18N) is the development of applications that can be run in different language environments (Japanese, Spanish, English, and so forth) without code revision or recompilation. The code is free of dependencies on language, character set, or special data representations (for example, currency and time formats).
Note: Not everything can be localized. Source code elements such as identifiers, resource names, and instance names must remain C-readable.
In the course of writing an application that will work in different languages, you must make decisions concerning coding, input techniques, and output methods and formats. Builder Xcessory has several features that assist you in producing an internationalized application.
Character Sets and Code Sets
The practical problem with I18N arises from the different ways in which languages represent their linguistic elements. In particular, ideographic languages, which can contain thousands of individual glyphs, cannot be fully represented using standard 8-bit code sets.
X and Motif I18N Support
When X11R5 and Motif 1.2 were released, they contained new data types, function calls, fonts, and modifications to widgets to handle the localization of graphical user interface (GUI) applications. These additions let you create applications, from one set of source code, that can run in any one of several alternative languages, depending on the locale. Although these additions make I18N and localization possible, they do not make them automatic. You must still address the following questions:
- How are characters encoded; how does your application deal with storage formats for dates, addresses, currency, and so on?
- How are the characters of a complex character set (for example, ideographic sets) actually entered when it is impractical to construct a physical keyboard capable of representing every one of those characters?
- How are characters of different languages displayed within the application, especially within the same text area?
Note: Builder Xcessory uses the same features supported in generated code, so you can choose to run Builder Xcessory in another locale. This allows strings to be input with that locale's input method.
Note: A Japanese version is available. Contact your ICS Sales Representative.
An I18N program must operate regardless of the encoding of the characters in the user's language. A program that ignores or truncates the eighth bit of every character (as some English-based applications do) will not work in Europe, which requires eight bits to represent accented characters. Similarly, an application that assumes that every character is eight bits long will not work in Japan, where there are many thousands of ideographic characters. In addition, you cannot assume a single character size, because Japanese commonly intermixes 16-bit Japanese characters with 8-bit Latin characters.
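The eighth-bit problem can be shown with a little arithmetic. The following is a minimal sketch, not code generated by Builder Xcessory; strip_high_bit and show_damage are hypothetical helper names, and the byte values assume the ISO8859-1 code set.

```c
#include <stdio.h>

/* Hypothetical helper: discard the eighth bit of a byte, as a
 * 7-bit-only program would. */
unsigned char strip_high_bit(unsigned char byte)
{
    return (unsigned char)(byte & 0x7F);
}

/* Print an ISO8859-1 byte next to what remains once the eighth bit
 * has been dropped. */
void show_damage(unsigned char byte)
{
    printf("0x%02X -> 0x%02X\n", byte, strip_high_bit(byte));
}
```

In ISO8859-1, é is encoded as 0xE9. Stripping the eighth bit leaves 0x69, which is the ASCII letter i, so the accented character is silently replaced by a different one.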
Xt support of internationalization is trivial in most applications: the only code required is a call to XtSetLanguageProc() just before the call to XtAppInitialize(). This one function call does all the set-up necessary for an Xt-based application. Some additional work is required if your application is to support internationalized text output or input (as explained in Input methods).
A locale is the language environment determined by the application at run time. The X/Open Portability Guide, Issue 3 (XPG3) defines locale as a means of specifying three characteristics of a language environment that might be required for localization:
As an example of using locales, assume you have an application with two different language versions. You set your XUSERFILESEARCHPATH (see Static output) to the following (all of which should appear on one line):
This search path allows your application to search from the most specific designation of the locale to the least specific, allowing you a high degree of flexibility in configuring your application. See the explanation of "XtResolvePathname" in Vol. 5 of the O'Reilly & Assoc. X Toolkit Intrinsics Reference Manual for an explanation of the various substitutions and how they relate to the locale designation.
The application must have a way of recognizing the language environment in which it is running. Based on this information, the application can then make adjustments such as allowing the display of strings in the appropriate language. Builder Xcessory provides a code generation option to enable this support in your application.
Note: ViewKit applications always have localization support enabled by the VkApp object.
2. Click the Application tab on the Code Generation Preferences dialog (see Code Generation Preferences Dialog).
Builder Xcessory inserts code into the main routine to initialize the toolkit I18N features. When the code is compiled and run, the toolkit examines the LANG environment variable to determine the current locale, and then initializes its internal routines to deal with locale-specific issues.
For example, if the user has set the LANG environment variable to ujis, the application automatically initializes support for Japanese language character input and display, as well as monetary and date output, and so forth.
If the Initialize Localization Support toggle is not set, you can manually set the locale in your application using the ANSI C function setlocale(). The call setlocale(LC_ALL, "") sets all locale-specific information to the default (the LANG environment variable). The call setlocale(LC_COLLATE, "ja_JP.ujis") sets only the collation order to that of the Japanese UJIS code set.
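The two setlocale() calls described here can be combined as in the following minimal sketch. The function name init_locale is hypothetical, and the sketch assumes that a requested locale may not be installed on a given system, so it checks each return value instead of assuming success.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: adopt the user's environment for every locale
 * category, then optionally override the collation order only.
 * Returns the name of the collation locale actually in effect. */
const char *init_locale(const char *collate_name)
{
    /* An empty string means: take the locale from the environment. */
    if (setlocale(LC_ALL, "") == NULL) {
        fprintf(stderr, "environment locale not supported; using \"C\"\n");
        setlocale(LC_ALL, "C");
    }

    /* Override collation only, for example with "ja_JP.ujis".
     * setlocale() returns NULL and changes nothing if the named
     * locale is not available. */
    if (collate_name != NULL && setlocale(LC_COLLATE, collate_name) == NULL) {
        fprintf(stderr, "collation locale %s not available\n", collate_name);
    }

    /* A NULL second argument queries the current setting. */
    return setlocale(LC_COLLATE, NULL);
}
```

Calling init_locale("ja_JP.ujis") mirrors the example in the text. Locale names vary between systems, which is why the sketch reports a failure and keeps the current collation order.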
setlocale() is called from the default language procedure installed by XtSetLanguageProc(), described in detail in Asente, Converse, Swick's X Window System Toolkit.
XtSetLanguageProc() also initializes the toolkit internationalization techniques, connects to the input method (if necessary), and sets various defaults for the current locale, as specified by the LANG environment variable.
Note: The default language procedure can be replaced using XtSetLanguageProc() with arguments other than NULL. Refer to O'Reilly & Associates' X Window System, Volumes 4 and 5 for a more detailed discussion of Xt language procedures.
· Use ANSI C or C++; Kernighan & Ritchie implementations may not support locale-aware string manipulation.
· Use strcoll() rather than strcmp() for string comparisons. The strcmp() routine assumes an ASCII character set when doing string comparisons; strcoll() has no such limitation and can deal with locale-specific character encoding and sorting order.
· Use wchar_t, rather than char, as the type for string processing. The char type allocates a fixed size (8 or 16 bit, depending on the architecture) for a character. This is not enough to hold the characters of some locales, particularly the ideographic languages. The wchar_t type allocates a size sufficient to hold the largest character in any supported locale. This can be grossly inefficient, so you should only use wchar_t for operations that index arrays of characters.
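The strcoll() and wchar_t guidelines above can be sketched as follows. This is an illustration only: sort_labels and character_at are hypothetical helper names, and the sketch assumes that the locale has already been set and that mbstowcs() accepts a NULL destination to count characters, as POSIX specifies.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* qsort() comparator: order two strings by the collation rules of the
 * current locale rather than by raw byte value. */
static int compare_collated(const void *a, const void *b)
{
    const char *const *left = a;
    const char *const *right = b;
    return strcoll(*left, *right);
}

/* Hypothetical helper: sort an array of strings in place using
 * locale-aware collation. */
void sort_labels(const char **labels, size_t count)
{
    qsort(labels, count, sizeof labels[0], compare_collated);
}

/* Hypothetical helper: return the character at position `index` of a
 * multibyte string, or L'\0' if the index is out of range or the
 * string is not valid in the current locale. */
wchar_t character_at(const char *text, size_t index)
{
    wchar_t result = L'\0';

    /* With a NULL destination, mbstowcs() only counts the wide
     * characters that the conversion would produce. */
    size_t length = mbstowcs(NULL, text, 0);
    if (length == (size_t)-1 || index >= length)
        return result;

    wchar_t *wide = malloc((length + 1) * sizeof *wide);
    if (wide == NULL)
        return result;

    /* Convert once, index by character, then discard the wide copy. */
    if (mbstowcs(wide, text, length + 1) != (size_t)-1)
        result = wide[index];
    free(wide);
    return result;
}
```

The wide copy exists only for the duration of the indexing operation, which follows the advice to keep strings in char form and use wchar_t only where characters must be indexed.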
Your application should not explicitly code any language-dependent information. This includes strings, fonts, and language-dependent pixmaps. To achieve this, the Open Group (formerly OSF) suggests that these resources be placed in message catalogs, resource files, or UIL files.
Builder Xcessory allows resources to be placed in resource files. Once a resource is set, you can choose (on an individual resource or resource class basis) whether that resource is set in the code or in a resource file.
Individually, this choice is made with the Resource Placement menu, to the right of the text field used to enter the resource value (see Resource Editor Placement Settings).
Resource values can also be placed in a resource file on a type basis by setting the resource's default resource placement. In the placement window, types can be specified to be put, by default, into Code or App (resource file).
2. Scroll the dialog to find the Compound String and Font types, and set the App toggle for each (see Default Resource Type Settings).
Builder Xcessory also allows you to generate single or multiple UIL files, providing another language-independent way of specifying resource values that can be used to internationalize an application. To save resources in a UIL file, set the resource placement to Code and, when generating code, generate UIL instead of C or C++.
Note: You can only generate UIL files if generating an application in C (not in C++).
Once you generate a resource (or UIL) file, you must make copies of the file for each language supported and modify the contents accordingly. Then, using the locale of the machine and the environment variables LANG, UIDPATH, and XUSERFILESEARCHPATH, the different resource or UIL files are read and used by the application at run time.
An internationalized program must be able to display all the characters used in the user's language, and must allow the user to specify all those characters as input. When there are more characters in a language than there are keys on a keyboard, some sort of "input method" is required to convert multiple keystrokes to single characters.
An input method is a mapping between keyboard input and the text data passed to the application. Such a mapping exists even within the familiar context of ISO8859-1 where, for example, the combination of the <Ctrl> or <Alt> key and a letter translates into a letter with a special accent mark: ü, é, and so forth.
Note: Within the 7-bit ASCII characters, there are no accented characters. However, ISO8859-1 is a superset of ASCII extending the code set to 8 bits, and includes accented characters and symbols.
The concept of an input method is especially important for ideographic languages. Review Chapter 11 of the OSF/Motif Programmer's Guide, Release 1.2 for a detailed discussion of the different aspects of input methods and how they are supported by Motif.
Builder Xcessory assumes that you have access to an input method. Input methods are available in Motif 1.2. Prior to Motif 1.2 and X11R5, input methods were proprietary additions, and no standard existed. Builder Xcessory supports the use of X11R5-style input methods exclusively.
Input methods allow your users to enter text in their native language. There are several input methods available from hardware vendors and third party software vendors. The X source code distribution also includes a few sample implementations. They run as separate processes alongside the internationalized applications.
Note: Multiple input methods can run simultaneously for any number of internationalized applications.
An internationalized application displays all text in the user's chosen language. This includes prompts, error messages, and text on buttons, menus, and other widgets. The simplest approach to this requirement is to remove all strings that are to be displayed from the source code of the application and store them in a file that will be read when the application starts up. That file can then be translated into various languages, with the appropriate version being read at start-up.
In addition, an internationalized application must display times, dates, numbers, and so on, in the format that the user expects. For example, Americans expect dates in the form month/day/year, the English expect day/month/year, and Germans expect day.month.year.
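One way to obtain the expected date format is to let the C library choose it. The following minimal sketch uses the standard strftime() function with the %x conversion, which produces the date representation of the current LC_TIME locale. The function name format_local_date is hypothetical, and the sketch assumes the locale has already been set elsewhere.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: write a calendar date into `buffer` using the
 * date representation of the current LC_TIME locale.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t format_local_date(char *buffer, size_t size,
                         int year, int month, int day)
{
    struct tm when = {0};

    when.tm_year = year - 1900;  /* struct tm counts years from 1900 */
    when.tm_mon  = month - 1;    /* and months from 0 */
    when.tm_mday = day;

    /* %x expands to the locale's preferred date format. */
    return strftime(buffer, size, "%x", &when);
}
```

In the default "C" locale, %x is equivalent to %m/%d/%y, so 9 March 2024 is written as 03/09/24. In other locales the same call yields the ordering and separators of that locale, provided the locale is installed.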
Most languages use words from different languages and some require the word's native character set to be used. For example, a Japanese application might require error messages with a mix of Hiragana, Katakana, and Kanji characters, as well as some technical terms that require a Latin character set. This means that in one string there could be five words using Kanji characters and one word using Latin characters, requiring two different fonts.
Note: It is possible to display characters of multiple character sets within the same output string using compound strings and font lists.
Motif uses compound strings in many widgets. A compound string is used to set labels on label and button widgets as well as the contents of lists in Motif. These compound strings hold all information related to a string, including the text, direction and font used to display the string.
The Compound String Editor allows strings to have multiple fonts and directions, and allows for a connection to an X input method. For more information on the Compound String Editor, refer to Compound String Editor.
Motif does not supply a String-to-XmString converter that understands font list tags or direction information. Builder Xcessory provides a String-to-XmString converter in the bxutils file, which supports Builder Xcessory style ASCII representations of compound strings.
A font list is a resource type that can be a single font or a font set. Font sets were introduced in X11R5 and Motif 1.2 (as the XFontSet). A font set is treated as a single entry in a font list, but contains all the fonts required to display all the characters of a locale. Internal X, Motif, and C routines are used to encode the font information for displaying a given string.
The Font List Editor in Builder Xcessory supports the specification of font sets as well as regular fonts. For more information on the Font List Editor, refer to Font List Editor.
Note: A font set is defined by the locale in which the program is running. For example, a font set that works in a Korean locale will not work in a Japanese locale.
Although Motif 1.2 supports multiple entry font lists containing both fonts and font sets, Motif does not supply a String-to-XmString converter that understands font list tags or direction information. Builder Xcessory provides such a converter in the bxutils file. This converter is installed in your application by a call to RegisterBxConverters(). The converter supports a special textual representation of all the information encoded in a compound string: font tag, direction, etc. This allows you to quickly create multifont and multidirectional strings and use them with the Motif widget set, even though they are not supported in Motif 1.2. This code is completely portable and OS independent.
Generating Localized Files
The following sections discuss the generation of files for an internationalized application. Static and dynamic output are considered for both UIL and C/C++ generation. In static output, strings are created when the source code is generated. In dynamic output, strings are incorporated at run time by reference to a separate source.
Static and dynamic output are handled in the same manner when generating UIL. Typically, App-defaults files are not used. Instead, the application maintains a list of messages in a separate UIL file for each locale. When the application is built, the appropriate UIL is compiled into a UID and then used.
When generating C/C++, save Strings and XmStrings, as well as other locale-specific information such as fonts and colors, in app-defaults. To do this, you can take advantage of the X environment variables XFILESEARCHPATH and XUSERFILESEARCHPATH.
If you plan to use C or C++ instead of UIL to handle string output, write your application so that dynamic output uses a message catalog. A message catalog is a method for storing and fetching strings to/from external sources.
3. Use the DBM string fetching routines, passing the tag string and a default message string as parameters. If the lookup of the message catalog yields no match for the tag string, display the default message string.
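The lookup-with-default behavior described in this step can be sketched as follows. The signatures of the DBM routines are not shown in this chapter, so the sketch does not use them: the in-memory table, the tag names, and the function fetch_message are all hypothetical stand-ins for a real message catalog.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a message catalog: each entry pairs a tag
 * with the translated text for one locale (German, in this sketch). */
struct message {
    const char *tag;
    const char *text;
};

static const struct message catalog[] = {
    { "file.saved",     "Datei gespeichert"    },
    { "file.not_found", "Datei nicht gefunden" },
};

/* Hypothetical helper: return the catalog text for `tag`, or
 * `default_text` when the catalog has no entry for that tag. */
const char *fetch_message(const char *tag, const char *default_text)
{
    size_t count = sizeof catalog / sizeof catalog[0];

    for (size_t i = 0; i < count; i++) {
        if (strcmp(catalog[i].tag, tag) == 0)
            return catalog[i].text;
    }
    return default_text;
}
```

Because the caller always passes a default message, the application still displays readable text when a catalog is missing or incomplete for the current locale.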