What are the system Transliterators available with ICU4J?
Author: Deron Eriksson
Description: This Java tutorial shows how to list the system transliterators available with ICU4J.
Tutorial created using:
Windows XP || JDK 1.6.0_24 || Eclipse Java EE IDE for Web Developers, Indigo
The ICU4J library makes it possible to perform a wide variety of text conversions from one script or format to another. Transliterator objects perform these conversions. The Transliterator.getAvailableIDs() method returns an Enumeration of Strings representing the IDs of the available Transliterators. Each ID describes the type of conversion that the corresponding Transliterator performs. To retrieve a particular Transliterator, pass its String ID to the Transliterator.getInstance() method.

System Transliterators are the Transliterators that ship with ICU4J. You can also register user Transliterators. If you haven't registered any user Transliterators, the getAvailableIDs() method returns only system Transliterators.

The following code snippet prints all of the Transliterator IDs to the console.

    import java.util.Enumeration;
    import com.ibm.icu.text.Transliterator;
    // Print the ID of every available Transliterator, one per line.
    Enumeration<String> availableIDs = Transliterator.getAvailableIDs();
    while (availableIDs.hasMoreElements()) {
        System.out.println(availableIDs.nextElement());
    }

The console output is shown below for version 4.8.1.1 of the icu4j library.
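Because getAvailableIDs() returns an Enumeration rather than a List, it can be convenient to copy the IDs into a List first, for example to sort them before printing. The following is a minimal sketch that uses only the standard library. The class name SortedIdList and the method toSortedList are hypothetical, and the sample IDs stand in for the Enumeration that Transliterator.getAvailableIDs() would supply when ICU4J is on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Vector;

// Hypothetical helper, not part of ICU4J.
final class SortedIdList {

    // Copies the Enumeration into a List and sorts it, ignoring case.
    static List<String> toSortedList(Enumeration<String> ids) {
        List<String> list = Collections.list(ids);
        Collections.sort(list, String.CASE_INSENSITIVE_ORDER);
        return list;
    }

    public static void main(String[] args) {
        // Stand-in data (assumption). With ICU4J available, pass
        // Transliterator.getAvailableIDs() to toSortedList instead.
        Vector<String> sample = new Vector<String>(
                Arrays.asList("Latin-ASCII", "Any-Upper", "cs-ja", "Han-Latin"));
        for (String id : toSortedList(sample.elements())) {
            System.out.println(id);
        }
    }
}
```

The listing that follows was produced by the unsorted loop from the tutorial, not by this sketch, so its order is the order in which ICU4J returned the IDs.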
Console output displaying the ICU4J system Transliterators

ASCII-Latin Accents-Any Amharic-Latin/BGN Any-Accents Any-Publishing Arabic-Latin Arabic-Latin/BGN Armenian-Latin Armenian-Latin/BGN Azerbaijani-Latin/BGN Belarusian-Latin/BGN Bengali-Devanagari Bengali-Gujarati Bengali-Gurmukhi Bengali-Kannada Bengali-Latin Bengali-Malayalam Bengali-Oriya Bengali-Tamil Bengali-Telugu Bopomofo-Latin Bulgarian-Latin/BGN Cyrillic-Latin Devanagari-Bengali Devanagari-Gujarati Devanagari-Gurmukhi Devanagari-Kannada Devanagari-Latin Devanagari-Malayalam Devanagari-Oriya Devanagari-Tamil Devanagari-Telugu Digit-Tone Fullwidth-Halfwidth Georgian-Latin Georgian-Latin/BGN Greek-Latin Greek-Latin/BGN Greek-Latin/UNGEGN Gujarati-Bengali Gujarati-Devanagari Gujarati-Gurmukhi Gujarati-Kannada Gujarati-Latin Gujarati-Malayalam Gujarati-Oriya Gujarati-Tamil Gujarati-Telugu Gurmukhi-Bengali Gurmukhi-Devanagari Gurmukhi-Gujarati Gurmukhi-Kannada Gurmukhi-Latin Gurmukhi-Malayalam Gurmukhi-Oriya Gurmukhi-Tamil Gurmukhi-Telugu Halfwidth-Fullwidth Han-Latin Han-Latin/Names Hangul-Latin Hebrew-Latin Hebrew-Latin/BGN Hiragana-Katakana Hiragana-Latin IPA-XSampa Jamo-Latin JapaneseKana-Latin/BGN Kannada-Bengali Kannada-Devanagari Kannada-Gujarati Kannada-Gurmukhi Kannada-Latin Kannada-Malayalam Kannada-Oriya Kannada-Tamil Kannada-Telugu Katakana-Hiragana Katakana-Latin Kazakh-Latin/BGN Kirghiz-Latin/BGN Korean-Latin/BGN Latin-ASCII Latin-Arabic Latin-Armenian Latin-Bengali Latin-Bopomofo Latin-Cyrillic Latin-Devanagari Latin-Georgian Latin-Greek Latin-Greek/UNGEGN Latin-Gujarati Latin-Gurmukhi Latin-Han Latin-Hangul Latin-Hebrew Latin-Hiragana Latin-Jamo Latin-Kannada Latin-Katakana Latin-Malayalam Latin-NumericPinyin Latin-Oriya Latin-Syriac Latin-Tamil Latin-Telugu Latin-Thaana Latin-Thai Macedonian-Latin/BGN Malayalam-Bengali Malayalam-Devanagari Malayalam-Gujarati Malayalam-Gurmukhi Malayalam-Kannada Malayalam-Latin Malayalam-Oriya Malayalam-Tamil Malayalam-Telugu Maldivian-Latin/BGN
Mongolian-Latin/BGN NumericPinyin-Latin NumericPinyin-Pinyin Oriya-Bengali Oriya-Devanagari Oriya-Gujarati Oriya-Gurmukhi Oriya-Kannada Oriya-Latin Oriya-Malayalam Oriya-Tamil Oriya-Telugu Pashto-Latin/BGN Persian-Latin/BGN Pinyin-NumericPinyin Publishing-Any Russian-Latin/BGN Serbian-Latin/BGN Simplified-Traditional Syriac-Latin Tamil-Bengali Tamil-Devanagari Tamil-Gujarati Tamil-Gurmukhi Tamil-Kannada Tamil-Latin Tamil-Malayalam Tamil-Oriya Tamil-Telugu Telugu-Bengali Telugu-Devanagari Telugu-Gujarati Telugu-Gurmukhi Telugu-Kannada Telugu-Latin Telugu-Malayalam Telugu-Oriya Telugu-Tamil Thaana-Latin Thai-Latin Tone-Digit Traditional-Simplified Turkmen-Latin/BGN Ukrainian-Latin/BGN Uzbek-Latin/BGN XSampa-IPA cs-cs_FONIPA cs-ja cs-ko cs_FONIPA-ja cs_FONIPA-ko es-am es-es_FONIPA es-ja es-zh es_419-ja es_419-zh es_FONIPA-am es_FONIPA-es_419_FONIPA es_FONIPA-ja es_FONIPA-zh it-am it-ja ja_Latn-ko ja_Latn-ru pl-ja pl-pl_FONIPA pl_FONIPA-ja ro-ja ro-ro_FONIPA ro_FONIPA-ja ru-ja ru-zh sk-ja sk-sk_FONIPA sk_FONIPA-ja zh_Latn_PINYIN-ru Any-Null Any-Remove Any-Hex/Unicode Any-Hex/Java Any-Hex/C Any-Hex/XML Any-Hex/XML10 Any-Hex/Perl Any-Hex Hex-Any/Unicode Hex-Any/Java Hex-Any/C Hex-Any/XML Hex-Any/XML10 Hex-Any/Perl Hex-Any Any-Lower Any-Upper Any-Title Any-CaseFold Any-Name Name-Any Any-NFC Any-NFD Any-NFKC Any-NFKD Any-FCD Any-FCC Any-Latin Any-Latin/Names Any-Latin/BGN Any-zh Any-am Any-es_419_FONIPA Any-ja Any-Katakana Any-ru Any-sk_FONIPA Any-cs_FONIPA Any-ko Any-Telugu Any-Oriya Any-Gurmukhi Any-Devanagari Any-Malayalam Any-Bengali Any-Tamil Any-Kannada Any-pl_FONIPA Any-Hiragana Any-ro_FONIPA Any-Gujarati Any-Latin/UNGEGN Any-Hangul Any-Han Any-Arabic Any-Syriac Any-Hebrew Any-Thai Any-Cyrillic Any-Georgian Any-Armenian Any-Greek Any-Greek/UNGEGN Any-Bopomofo Any-Thaana Any-es_FONIPA

In another tutorial, you can see how to use the "Han-Latin" Transliterator to convert Chinese characters to their Latin character equivalents.
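Every ID in the listing above has the shape Source-Target, optionally followed by /Variant (for example Greek-Latin/UNGEGN). The sketch below splits an ID of that shape into its parts and groups IDs by source, which can make a long list easier to scan. It uses only the standard library. The class TransliteratorIdParts and its methods are hypothetical helpers, not part of ICU4J, and the sketch deliberately handles only the two shapes that appear in the listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of ICU4J. Assumes IDs of the form
// "Source-Target" or "Source-Target/Variant", as seen in the listing.
final class TransliteratorIdParts {
    final String source;
    final String target;
    final String variant; // empty string when the ID has no "/Variant" suffix

    private TransliteratorIdParts(String source, String target, String variant) {
        this.source = source;
        this.target = target;
        this.variant = variant;
    }

    static TransliteratorIdParts parse(String id) {
        String pair = id;
        String variant = "";
        int slash = id.indexOf('/');
        if (slash >= 0) {
            pair = id.substring(0, slash);
            variant = id.substring(slash + 1);
        }
        int dash = pair.indexOf('-');
        if (dash < 0) {
            throw new IllegalArgumentException("Expected Source-Target: " + id);
        }
        return new TransliteratorIdParts(
                pair.substring(0, dash), pair.substring(dash + 1), variant);
    }

    // Groups the given IDs by their source part, with sources in sorted order.
    static Map<String, List<String>> groupBySource(Iterable<String> ids) {
        Map<String, List<String>> bySource = new TreeMap<String, List<String>>();
        for (String id : ids) {
            String source = parse(id).source;
            List<String> group = bySource.get(source);
            if (group == null) {
                group = new ArrayList<String>();
                bySource.put(source, group);
            }
            group.add(id);
        }
        return bySource;
    }

    public static void main(String[] args) {
        // A few IDs copied from the listing above.
        List<String> sample = Arrays.asList(
                "Greek-Latin", "Greek-Latin/UNGEGN", "Han-Latin/Names", "Any-Hex/XML10");
        for (Map.Entry<String, List<String>> e : groupBySource(sample).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

With ICU4J on the classpath, the IDs returned by Transliterator.getAvailableIDs() could be collected into a list and passed to groupBySource in place of the sample data.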
Here is the Maven dependency for the ICU4J library used in this tutorial:

Maven Dependency for ICU4J

    <dependency>
        <groupId>com.ibm.icu</groupId>
        <artifactId>icu4j</artifactId>
        <version>4.8.1.1</version>
    </dependency>