Internationalization allows software to be adapted to any language and cultural convention. During the internationalization process, the programmer isolates the parts of a program that are dependent on language and culture. For example, the programmer will isolate error messages because they must be translated during localization.
What is localization?
Localization is the process of adapting a program for use in a specific locale. A locale is a geographic or political region that shares the same language and customs. Localization includes the translation of text such as GUI labels, error messages, and online help. It also includes the culture-specific formatting of data items such as monetary values, times, dates, and numbers.
How do I go about internationalizing an existing program?
See the steps outlined in the Checklist section of The Java Tutorial.
A locale is a geographic or political region that shares the same language and customs. In the Java programming language, a locale is represented by a Locale object. Locale-sensitive operations, such as collation and date formatting, vary according to Locale.
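As a minimal sketch of what "locale-sensitive" means in practice, the following example formats the same number for two locales. The class name LocaleDemo and the sample value are illustrative assumptions, not part of the JDK.

```java
import java.text.NumberFormat;
import java.util.Locale;

final class LocaleDemo {
    // Formats a number using the conventions of the given locale.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        Locale us = Locale.US;                             // predefined constant
        Locale germany = Locale.forLanguageTag("de-DE");   // built from a language tag

        // The same value, with different grouping and decimal separators.
        System.out.println(formatNumber(1234.5, us));      // 1,234.5
        System.out.println(formatNumber(1234.5, germany)); // 1.234,5
    }
}
```

The code that formats the number never mentions commas or periods; only the Locale object changes.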
Locale objects?
See the Setting the Locale section of The Java Tutorial.
Which locales are supported?
The locales supported by the JDK software are listed at Supported Locales. A platform other than the JDK may support a different set of locales.
Can a Java application use multiple locales?
Yes. This capability allows you to create multi-lingual applications.
How does setting the default locale affect the results of sorting?
The Collator class and its subclasses are used for building sorting routines. These classes are locale-sensitive, and a Collator obtained from the no-argument Collator.getInstance() factory method uses the collating sequence of the default locale.
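The following sketch contrasts plain String ordering with Collator ordering. The class name CollatorSortDemo and the word list are assumptions made for illustration. Passing an explicit locale keeps the result independent of the machine's default locale; the no-argument factory follows whatever Locale.getDefault() returns.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

final class CollatorSortDemo {
    // Returns a sorted copy of the words using the collation rules of the given locale.
    static String[] sortFor(Locale locale, String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"banana", "apple", "Cherry"};

        // Plain String ordering compares char values, so uppercase sorts first.
        String[] byCharValue = words.clone();
        Arrays.sort(byCharValue);
        System.out.println(Arrays.toString(byCharValue)); // [Cherry, apple, banana]

        // An explicit locale gives the same order on every machine.
        System.out.println(Arrays.toString(sortFor(Locale.US, words))); // [apple, banana, Cherry]

        // The no-argument factory uses the default locale, so this order
        // can change when the default locale changes.
        Arrays.sort(words, Collator.getInstance());
        System.out.println(Arrays.toString(words));
    }
}
```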
A ResourceBundle object allows you to isolate localizable elements from the rest of the application. With all resources separated into a bundle, the application simply loads the appropriate bundle for the active locale. If the user switches locales, the application just loads a different bundle.
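The idea can be sketched with two small in-memory bundles. This is a simplification: an application would normally keep its bundles in separate classes or properties files and let ResourceBundle.getBundle locate them. The class name BundleDemo, the bundleFor helper, and the keys and strings are all hypothetical.

```java
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

final class BundleDemo {
    // Hypothetical lookup: returns a French bundle for French locales and an
    // English bundle otherwise. ResourceBundle.getBundle normally does this
    // selection for bundles stored on the class path.
    static ResourceBundle bundleFor(Locale locale) {
        final boolean french = "fr".equals(locale.getLanguage());
        return new ListResourceBundle() {
            @Override
            protected Object[][] getContents() {
                return french
                    ? new Object[][] {{"greeting", "Bonjour"}, {"farewell", "Au revoir"}}
                    : new Object[][] {{"greeting", "Hello"}, {"farewell", "Goodbye"}};
            }
        };
    }

    public static void main(String[] args) {
        // The calling code only knows the keys, never the translated text.
        System.out.println(bundleFor(Locale.US).getString("greeting"));     // Hello
        System.out.println(bundleFor(Locale.FRANCE).getString("greeting")); // Bonjour
    }
}
```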
ResourceBundle objects?
See the Isolating Locale-Specific Data section of The Java Tutorial.
How do I specify non-ASCII strings in a properties file?
You can specify any Unicode character with the \uXXXX notation. (The XXXX denotes the 4 hexadecimal digits that make up the Unicode value of a character.) For example, a properties file might have the following entries:
s1=hello there
s2=\uff2d\uff33\u30b4
If you have edited and saved the file in a non-ASCII encoding, you can convert it to ASCII with the native2ascii tool. For example, you might want to do this when editing a properties file in Shift JIS, a popular Japanese encoding.
How do I compile a non-ASCII ListResourceBundle?
If your source file is in a non-ASCII encoding, you can direct the compiler to convert it into Unicode. For example, you would compile a Japanese resource bundle written in the Shift JIS encoding as follows:
javac -encoding SJIS LabelsResource_ja.java
You can use the SimpleDateFormat class to format and parse dates in a locale-sensitive manner. See the section on formatting Dates and Times in The Java Tutorial.
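As a rough illustration, the sketch below parses a date written with US English names and formats it again for another locale. The class name DateFormatDemo, the pattern, and the fixed UTC time zone are assumptions chosen to keep the result repeatable.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class DateFormatDemo {
    // Builds a formatter with a fixed pattern and time zone so results are repeatable.
    static SimpleDateFormat formatterFor(Locale locale) {
        SimpleDateFormat format = new SimpleDateFormat("EEEE, d MMMM yyyy", locale);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format;
    }

    // Parses text written with US English names and formats it for the target locale.
    static String reformat(String usText, Locale target) {
        try {
            Date date = formatterFor(Locale.US).parse(usText);
            return formatterFor(target).format(date);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + usText, e);
        }
    }

    public static void main(String[] args) {
        String text = "Monday, 5 October 1998";
        // Day and month names follow the locale passed to the formatter.
        System.out.println(reformat(text, Locale.US));
        System.out.println(reformat(text, Locale.FRANCE));
    }
}
```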
This was caused by a bug that was fixed in release 1.1.6 of the JDK software.
font.properties file?
The font.properties file maps the fonts of the host platform, such as Solaris or Win32, to Java virtual fonts. The font.properties file is in the $JAVAHOME/lib directory.
How do I add a font?
See the web page Adding Fonts to the Java Runtime Environment.
Why can't I see a particular character in my TextField and TextArea components?
The proper font is not installed on your platform.
I have installed a Unicode font, but my program cannot display all Unicode characters. What's the problem?
The characters that cannot be displayed may not be in the font.
What font types does the JDK software support for the Win32 and Solaris platforms?
The release for Win32 platforms supports TrueType fonts. The release for Solaris supports outline fonts that can be handled by an X11 server, such as F3, Type1, and TrueType.
What classes of fonts are supported by the Java Runtime Environment?
Version 1.0 of the JDK software included the font names TimesRoman, Courier, and Helvetica, which were very specific and do not apply to many locales. Version 1.1 supports the following classes of fonts:
The virtual font name is the name of the font recognized by the Java Runtime Environment. The platform font name is the actual name of the font on the host platform. For example, Dialog and Serif are virtual font names, and Times and Helvetica are the platform font names on a Win32 or Solaris platform.
Is it possible to display more than one language in the Java Runtime Environment?
Yes. To implement a multi-lingual display, you make the necessary changes to the font.properties file and remove the language-specific font.properties.xx files. See the web page, Adding Fonts to the Java Runtime Environment, for details.
Why does my Chinese font with Big5 encoding work fine on Windows NT but not on Windows 95?
Windows NT's internal encoding is Unicode, so it can support Unicode Chinese characters if a Big5 font is installed. However, Windows 95 uses the ANSI codepage, which limits it to the 8859_1 code page. Therefore, on Windows 95 a TextArea component won't work correctly with Big5 encoded Chinese characters.
The default fonts are listed in the following table:
lang (locale)       screen-width     font-typefaces  font-size      font-encoding
korean (ko)         WIDTH > 1175     Round Gothic    18 (point)     ksc5601.1987-0
korean (ko)         850<WIDTH<1176   Round Gothic    16 (point)     ksc5601.1987-0
korean (ko)         851 > WIDTH      Round Gothic    14 (point)     ksc5601.1987-0
korean (ko.UTF-8)   same as above    same as above   same as above  ksc5601.1992-3
japanese (ja)       > 1175           Gothic          16 (point)     jisx0201.1976-0
japanese (ja)       < 1176           Gothic          14 (point)     jisx0201.1976-0
T-chinese (zh_TW)   > 1175           Sung            18 (point)     cns11643-[1..16]
T-chinese (zh_TW)   < 1176           Sung            16 (point)     cns11643-[1..16]
T-chinese (BIG5)    > 1175           Ming            18 (point)     big5-1
T-chinese (BIG5)    < 1176           Ming            16 (point)     big5-1
S-chinese (zh)      > 1175           Song            16 (point)     gb2312.1980-0
S-chinese (zh)      < 1176           Song            14 (point)     gb2312.1980-0
A character encoding is a mapping between characters and code values.
What is Unicode?
In the Java programming language, char values represent Unicode characters. Unicode is a 16-bit character encoding that supports the world's major languages. You can learn more about the Unicode standard at the Unicode Consortium web site.
The Converting Non-Unicode Text section of The Java Tutorial explains how to perform the conversions within an application. To convert data files, use the native2ascii tool.
Which character encodings are supported when converting text to and from Unicode?
See the Supported Encodings web page.
I can't find the CharToByteConverter class. What should I use to convert character encodings?
The CharToByteConverter class is available only in the sun.io package. If you use this package, your program will be platform-dependent. Instead, try using the InputStreamReader and OutputStreamWriter classes, which belong to the java.io package.
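A minimal sketch of such a conversion with the java.io classes follows. The class name EncodingConversionDemo and the helper names encode and decode are assumptions made for this example; the encoding names must be ones the runtime supports.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

final class EncodingConversionDemo {
    // Converts Unicode text to bytes in the named encoding.
    static byte[] encode(String text, String encoding) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, encoding)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Converts bytes in the named encoding back to Unicode text.
    static String decode(byte[] data, String encoding) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), encoding)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // the last letter is e with an acute accent
        System.out.println(encode(original, "ISO-8859-1").length); // 4
        System.out.println(encode(original, "UTF-8").length);      // 5
        System.out.println(decode(encode(original, "UTF-8"), "UTF-8").equals(original)); // true
    }
}
```

In a real program the byte array streams would be replaced by file or socket streams; the reader and writer do the conversion in the same way.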
Yes, but this is typically done by licensees, not by application programmers. You'll need to extend the ByteToCharConverter and CharToByteConverter classes. See the Charset Converter section in the Adding Fonts to the Java Runtime web page.
UTF8 stands for UCS Transformation Format 8 (also expanded as Unicode Transformation Format, 8-bit). It is a transmission format for Unicode that is safe for UNIX file systems.
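The sketch below shows two properties behind that statement: ASCII characters keep their single-byte form, and the bytes of a multi-byte character never collide with the NUL byte or the '/' path separator. The class name Utf8Demo and the sample strings are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Demo {
    // Number of bytes the text occupies once encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the UTF-8 form of the text contains the given byte value.
    static boolean utf8Contains(String text, int byteValue) {
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if ((b & 0xFF) == byteValue) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));      // 1 (plain ASCII)
        System.out.println(utf8Length("\u00e9")); // 2 (Latin letter with accent)
        System.out.println(utf8Length("\u65e5")); // 3 (CJK ideograph)

        // Two CJK ideographs: every encoded byte has its high bit set.
        String japanese = "\u65e5\u672c";
        System.out.println(utf8Contains(japanese, 0x00)); // false
        System.out.println(utf8Contains(japanese, '/'));  // false
    }
}
```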
What is a file encoding?
A file encoding is the standard used to encode character data in a file. A string identifying the file encoding is stored in the file.encoding property of the System class. The file encoding is significant because the Java programming language uses Unicode for characters, but the file system of the host platform probably uses some other encoding. This encoding varies with host platform and locale. If the encoding matches the file.encoding property, then the conversion of the character data into Unicode is transparent to the programmer.
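To see what your own runtime reports, a short sketch such as the following can be used. The class name FileEncodingDemo is an assumption, and the printed values depend on the platform, locale, and JDK release, so no particular output should be expected.

```java
import java.nio.charset.Charset;

final class FileEncodingDemo {
    // Reads the property that names the platform's default file encoding.
    static String fileEncodingProperty() {
        return System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // Both values vary with the host platform, locale, and JDK release.
        System.out.println("file.encoding   = " + fileEncodingProperty());
        System.out.println("default charset = " + Charset.defaultCharset());
    }
}
```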
For releases 1.1.7 and 1.2.0, the default file encoding is CP1252 for Win32 and ISO8859_1 for Solaris.
Are the CP1252 and ISO8859_1 encodings identical?
No. CP1252 contains some additional characters in the range of \u0080 to \u009F.
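The difference can be observed by decoding the same byte with both encodings. This sketch assumes the runtime provides the windows-1252 charset (the standard name for CP1252); the class name Cp1252VsLatin1Demo is hypothetical.

```java
import java.nio.charset.Charset;

final class Cp1252VsLatin1Demo {
    // Decodes a single byte with the named charset and returns the resulting char.
    static char decodeByte(int value, String charsetName) {
        byte[] data = {(byte) value};
        return new String(data, Charset.forName(charsetName)).charAt(0);
    }

    public static void main(String[] args) {
        // Byte 0x41 is the letter A in both encodings.
        System.out.println(decodeByte(0x41, "ISO-8859-1") == decodeByte(0x41, "windows-1252")); // true

        // Byte 0x80 is a control code in ISO 8859-1 but the euro sign in windows-1252.
        System.out.printf("U+%04X%n", (int) decodeByte(0x80, "ISO-8859-1"));   // U+0080
        System.out.printf("U+%04X%n", (int) decodeByte(0x80, "windows-1252")); // U+20AC
    }
}
```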
The input method framework enables all text editing components to receive Japanese, Chinese, or Korean text input through input methods. An input method lets users enter thousands of different characters using keyboards with far fewer keys. Typically a sequence of several characters needs to be typed and then converted to create one or more characters. For specifications and examples see the web page, Input Method Framework.
How do you switch between Chinese and English input modes?
Solaris:
A user may have multiple input methods available. For example, the user may have input methods for different languages or input methods that accept various types of input. Such a user must be able to select the input method used for a particular language or the input method that provides the fastest input.
Can an input method be activated programmatically?
In release 1.1 of the JDK software, an input method can be activated only by the user's keystrokes. The FCS of release 1.2 permits programmatic activation of an input method.
Do the AWT and Swing (JFC) text components work with input methods?
See the Input Methods section of the JDK Software Internationalization Overview.
Since decomposing takes time, turning decomposition off makes comparisons go faster. However, for Latin languages the NO_DECOMPOSITION mode is not useful if the text contains accents. You should use the default decomposition unless you really know what you're doing.
The strength property you choose depends on what your application is trying to accomplish. For example, when performing a text search you may allow a "weak" match, in which accents and differences in case (upper vs. lower) are ignored. This type of search employs the PRIMARY strength. If you are sorting a list of words, you might want to use the TERTIARY strength. In this mode the properties that must match are the base character, accent, and case.
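A small sketch of the two strengths follows. The class name CollatorStrengthDemo, the matches helper, and the sample words are assumptions. The sketch sets CANONICAL_DECOMPOSITION explicitly so that accented letters are weighed as a base letter plus an accent.

```java
import java.text.Collator;
import java.util.Locale;

final class CollatorStrengthDemo {
    // Returns true if the two strings compare as equal under the given strength.
    static boolean matches(String a, String b, int strength) {
        Collator collator = Collator.getInstance(Locale.US);
        // Decompose accented characters so accents are weighed separately
        // from their base letters.
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        collator.setStrength(strength);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        // Case differences are ignored at PRIMARY strength but not at TERTIARY.
        System.out.println(matches("abc", "ABC", Collator.PRIMARY));  // true
        System.out.println(matches("abc", "ABC", Collator.TERTIARY)); // false

        // The same comparison for a word with and without accents.
        String accented = "r\u00e9sum\u00e9";
        System.out.println(matches("resume", accented, Collator.PRIMARY));
        System.out.println(matches("resume", accented, Collator.TERTIARY));
    }
}
```

A search feature would typically call the helper with PRIMARY, while a sort routine would hand a TERTIARY collator to the sorting method.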
Support for the euro currency is available in version 1.2 and later of the Java 2 platform. For information about support in release 1.1 see the web page, EURO CURRENCY PROPOSAL FOR JDK 1.1.x.
This page was updated on 5 October 1998.
Copyright © 1996-1999 Sun Microsystems, Inc. All rights reserved.