What are the system Transliterators available with ICU4J?
Author: Deron Eriksson
Description: This Java tutorial shows how to list the system transliterators available with ICU4J.
Tutorial created using: Windows XP || JDK 1.6.0_24 || Eclipse Java EE IDE for Web Developers, Indigo


The ICU4J library can perform a wide variety of text conversions, such as converting text from one script to another, and Transliterator objects carry out these conversions. The static Transliterator.getAvailableIDs() method returns an Enumeration of Strings containing the IDs of the available Transliterators. Each ID describes the conversion that the corresponding Transliterator performs, typically in the form Source-Target with an optional /Variant suffix (for example, Greek-Latin/UNGEGN). To obtain a particular Transliterator, pass its ID to the static Transliterator.getInstance() method.
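Most of the IDs listed later in this tutorial have the shape Source-Target, sometimes followed by /Variant. As a rough illustration of how such an ID can be read, here is a minimal sketch that splits a simple ID into those parts. The class TransliteratorIdParts and its parse method are hypothetical helpers written for this example; they are not part of ICU4J, and the sketch only handles the simple form seen in the listing.

```java
// Hypothetical helper, NOT part of ICU4J: splits a simple transliterator ID
// of the form "Source-Target" or "Source-Target/Variant" into its parts.
final class TransliteratorIdParts {
    final String source;
    final String target;
    final String variant; // null when the ID has no "/Variant" suffix

    private TransliteratorIdParts(String source, String target, String variant) {
        this.source = source;
        this.target = target;
        this.variant = variant;
    }

    static TransliteratorIdParts parse(String id) {
        // The first hyphen separates the source from the target.
        int dash = id.indexOf('-');
        if (dash <= 0 || dash == id.length() - 1) {
            throw new IllegalArgumentException("Expected Source-Target[/Variant]: " + id);
        }
        String source = id.substring(0, dash);
        String rest = id.substring(dash + 1);

        // An optional slash separates the target from the variant.
        int slash = rest.indexOf('/');
        String target = (slash < 0) ? rest : rest.substring(0, slash);
        String variant = (slash < 0) ? null : rest.substring(slash + 1);
        return new TransliteratorIdParts(source, target, variant);
    }

    public static void main(String[] args) {
        TransliteratorIdParts parts = parse("Greek-Latin/UNGEGN");
        // Prints: Greek | Latin | UNGEGN
        System.out.println(parts.source + " | " + parts.target + " | " + parts.variant);
    }
}
```

Splitting at the first hyphen keeps locale-style names such as es_FONIPA intact, because those names use underscores rather than hyphens.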

System Transliterators are the Transliterators that come shipped with ICU4J. You can also register user Transliterators. If you haven't registered any user Transliterators, the getAvailableIDs method will return only system Transliterators.

The following code snippet will display all of the Transliterator IDs to the console.


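import java.util.Enumeration;
import com.ibm.icu.text.Transliterator;

// Print the ID of every available Transliterator, one per line
Enumeration<String> availableIDs = Transliterator.getAvailableIDs();
while (availableIDs.hasMoreElements()) {
	System.out.println(availableIDs.nextElement());
}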

The console output is shown below for version 4.8.1.1 of the icu4j library.

Console output displaying the ICU4J system Transliterators

ASCII-Latin
Accents-Any
Amharic-Latin/BGN
Any-Accents
Any-Publishing
Arabic-Latin
Arabic-Latin/BGN
Armenian-Latin
Armenian-Latin/BGN
Azerbaijani-Latin/BGN
Belarusian-Latin/BGN
Bengali-Devanagari
Bengali-Gujarati
Bengali-Gurmukhi
Bengali-Kannada
Bengali-Latin
Bengali-Malayalam
Bengali-Oriya
Bengali-Tamil
Bengali-Telugu
Bopomofo-Latin
Bulgarian-Latin/BGN
Cyrillic-Latin
Devanagari-Bengali
Devanagari-Gujarati
Devanagari-Gurmukhi
Devanagari-Kannada
Devanagari-Latin
Devanagari-Malayalam
Devanagari-Oriya
Devanagari-Tamil
Devanagari-Telugu
Digit-Tone
Fullwidth-Halfwidth
Georgian-Latin
Georgian-Latin/BGN
Greek-Latin
Greek-Latin/BGN
Greek-Latin/UNGEGN
Gujarati-Bengali
Gujarati-Devanagari
Gujarati-Gurmukhi
Gujarati-Kannada
Gujarati-Latin
Gujarati-Malayalam
Gujarati-Oriya
Gujarati-Tamil
Gujarati-Telugu
Gurmukhi-Bengali
Gurmukhi-Devanagari
Gurmukhi-Gujarati
Gurmukhi-Kannada
Gurmukhi-Latin
Gurmukhi-Malayalam
Gurmukhi-Oriya
Gurmukhi-Tamil
Gurmukhi-Telugu
Halfwidth-Fullwidth
Han-Latin
Han-Latin/Names
Hangul-Latin
Hebrew-Latin
Hebrew-Latin/BGN
Hiragana-Katakana
Hiragana-Latin
IPA-XSampa
Jamo-Latin
JapaneseKana-Latin/BGN
Kannada-Bengali
Kannada-Devanagari
Kannada-Gujarati
Kannada-Gurmukhi
Kannada-Latin
Kannada-Malayalam
Kannada-Oriya
Kannada-Tamil
Kannada-Telugu
Katakana-Hiragana
Katakana-Latin
Kazakh-Latin/BGN
Kirghiz-Latin/BGN
Korean-Latin/BGN
Latin-ASCII
Latin-Arabic
Latin-Armenian
Latin-Bengali
Latin-Bopomofo
Latin-Cyrillic
Latin-Devanagari
Latin-Georgian
Latin-Greek
Latin-Greek/UNGEGN
Latin-Gujarati
Latin-Gurmukhi
Latin-Han
Latin-Hangul
Latin-Hebrew
Latin-Hiragana
Latin-Jamo
Latin-Kannada
Latin-Katakana
Latin-Malayalam
Latin-NumericPinyin
Latin-Oriya
Latin-Syriac
Latin-Tamil
Latin-Telugu
Latin-Thaana
Latin-Thai
Macedonian-Latin/BGN
Malayalam-Bengali
Malayalam-Devanagari
Malayalam-Gujarati
Malayalam-Gurmukhi
Malayalam-Kannada
Malayalam-Latin
Malayalam-Oriya
Malayalam-Tamil
Malayalam-Telugu
Maldivian-Latin/BGN
Mongolian-Latin/BGN
NumericPinyin-Latin
NumericPinyin-Pinyin
Oriya-Bengali
Oriya-Devanagari
Oriya-Gujarati
Oriya-Gurmukhi
Oriya-Kannada
Oriya-Latin
Oriya-Malayalam
Oriya-Tamil
Oriya-Telugu
Pashto-Latin/BGN
Persian-Latin/BGN
Pinyin-NumericPinyin
Publishing-Any
Russian-Latin/BGN
Serbian-Latin/BGN
Simplified-Traditional
Syriac-Latin
Tamil-Bengali
Tamil-Devanagari
Tamil-Gujarati
Tamil-Gurmukhi
Tamil-Kannada
Tamil-Latin
Tamil-Malayalam
Tamil-Oriya
Tamil-Telugu
Telugu-Bengali
Telugu-Devanagari
Telugu-Gujarati
Telugu-Gurmukhi
Telugu-Kannada
Telugu-Latin
Telugu-Malayalam
Telugu-Oriya
Telugu-Tamil
Thaana-Latin
Thai-Latin
Tone-Digit
Traditional-Simplified
Turkmen-Latin/BGN
Ukrainian-Latin/BGN
Uzbek-Latin/BGN
XSampa-IPA
cs-cs_FONIPA
cs-ja
cs-ko
cs_FONIPA-ja
cs_FONIPA-ko
es-am
es-es_FONIPA
es-ja
es-zh
es_419-ja
es_419-zh
es_FONIPA-am
es_FONIPA-es_419_FONIPA
es_FONIPA-ja
es_FONIPA-zh
it-am
it-ja
ja_Latn-ko
ja_Latn-ru
pl-ja
pl-pl_FONIPA
pl_FONIPA-ja
ro-ja
ro-ro_FONIPA
ro_FONIPA-ja
ru-ja
ru-zh
sk-ja
sk-sk_FONIPA
sk_FONIPA-ja
zh_Latn_PINYIN-ru
Any-Null
Any-Remove
Any-Hex/Unicode
Any-Hex/Java
Any-Hex/C
Any-Hex/XML
Any-Hex/XML10
Any-Hex/Perl
Any-Hex
Hex-Any/Unicode
Hex-Any/Java
Hex-Any/C
Hex-Any/XML
Hex-Any/XML10
Hex-Any/Perl
Hex-Any
Any-Lower
Any-Upper
Any-Title
Any-CaseFold
Any-Name
Name-Any
Any-NFC
Any-NFD
Any-NFKC
Any-NFKD
Any-FCD
Any-FCC
Any-Latin
Any-Latin/Names
Any-Latin/BGN
Any-zh
Any-am
Any-es_419_FONIPA
Any-ja
Any-Katakana
Any-ru
Any-sk_FONIPA
Any-cs_FONIPA
Any-ko
Any-Telugu
Any-Oriya
Any-Gurmukhi
Any-Devanagari
Any-Malayalam
Any-Bengali
Any-Tamil
Any-Kannada
Any-pl_FONIPA
Any-Hiragana
Any-ro_FONIPA
Any-Gujarati
Any-Latin/UNGEGN
Any-Hangul
Any-Han
Any-Arabic
Any-Syriac
Any-Hebrew
Any-Thai
Any-Cyrillic
Any-Georgian
Any-Armenian
Any-Greek
Any-Greek/UNGEGN
Any-Bopomofo
Any-Thaana
Any-es_FONIPA
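
The listing above is long and not in one alphabetical run (the Any-* entries come last), so it can help to filter and sort the IDs before printing them. The following is a minimal sketch that uses only the standard library. The class TransliteratorIdLister and its method are hypothetical names chosen for this example, and the sample data in main is a small hand-picked subset of the listing standing in for the real Enumeration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, NOT part of ICU4J.
final class TransliteratorIdLister {

    // Returns, in sorted order, the IDs whose source part equals the given name.
    static List<String> sortedIdsFromSource(Enumeration<String> ids, String source) {
        String prefix = source + "-";
        List<String> result = new ArrayList<String>();
        while (ids.hasMoreElements()) {
            String id = ids.nextElement();
            if (id.startsWith(prefix)) {
                result.add(id);
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        // Stand-in data. With ICU4J on the classpath, the Enumeration returned
        // by Transliterator.getAvailableIDs() could be passed in instead.
        Enumeration<String> sample = Collections.enumeration(Arrays.asList(
                "Latin-Greek", "Greek-Latin/UNGEGN", "Any-Null",
                "Greek-Latin", "Greek-Latin/BGN"));

        // Prints: [Greek-Latin, Greek-Latin/BGN, Greek-Latin/UNGEGN]
        System.out.println(sortedIdsFromSource(sample, "Greek"));
    }
}
```

Note that an Enumeration can be traversed only once, so call getAvailableIDs() again if you need a second pass over the IDs.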

In another tutorial, you can see how to use the "Han-Latin" Transliterator to convert Chinese characters to their Latin character equivalents.


Here is the Maven dependency for the ICU4J library used in this tutorial:

Maven Dependency for ICU4J


<dependency>
	<groupId>com.ibm.icu</groupId>
	<artifactId>icu4j</artifactId>
	<version>4.8.1.1</version>
</dependency>