International Components for Unicode

Name: International Components for Unicode
Developer: IBM and many other companies
Latest release: 4.0 (July 2, 2008)
Operating system: Cross-platform
Programming languages: C/C++ and Java
Genre: Libraries for Unicode and globalization
License: MIT License
Website: http://www.icu-project.org/

International Components for Unicode (ICU) is an open source project that provides mature C/C++ and Java libraries for Unicode support, software internationalization and software globalization. ICU is portable to many operating systems and environments, and it gives applications the same results on all platforms and between C/C++ and Java software. The project is sponsored, supported and used by IBM and many other companies.

The services it provides include the following:

* Text: Unicode text handling, full character properties and character set conversions
* Analysis: Unicode regular expressions; full Unicode sets; character, word and line boundaries
* Comparison: Language-sensitive collation and searching
* Transformations: Normalization, upper/lowercase conversion and script transliteration
* Locales: Comprehensive locale data and resource bundle architecture, via the Common Locale Data Repository
* Complex Text Layout: Arabic, Hebrew, Indic and Thai
* Time: Multi-calendar and time zone support
* Formatting and Parsing: Dates, times, numbers, currencies, messages and rule-based formats

ICU provides much richer internationalization facilities than the standard C and C++ libraries or most operating systems.
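Two of the services listed above, normalization and language-sensitive comparison, can be illustrated with a short sketch. To stay runnable without extra dependencies, it uses the JDK's own `java.text.Normalizer` and `java.text.Collator` as stand-ins rather than ICU itself; ICU4J offers comparable classes in its `com.ibm.icu.text` package. The class name `TextServicesSketch` and its helper methods are hypothetical, chosen only for this illustration.

```java
import java.text.Collator;
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; it uses JDK classes, not ICU4J, as an assumption
// made so that the example runs on a plain JDK.
final class TextServicesSketch {

    /** Transformations: compose base letters and combining marks (NFC). */
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    /** Comparison: sort a copy of the list with a language-sensitive collator. */
    static List<String> sortFor(Locale locale, List<String> words) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(collator);
        return sorted;
    }

    /** Comparison: true when two strings differ at most below primary strength. */
    static boolean samePrimary(Locale locale, String a, String b) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        // "e" followed by U+0301 COMBINING ACUTE ACCENT composes to one code point.
        System.out.println(toNfc("e\u0301").length());

        // A collator orders by letters first, unlike a raw code point sort,
        // which would put every uppercase letter before every lowercase one.
        System.out.println(sortFor(Locale.ENGLISH, List.of("Banana", "apple", "cherry")));

        System.out.println(samePrimary(Locale.ENGLISH, "abc", "ABC"));
    }
}
```

The same pattern of obtaining a service object for a locale and then applying it to text applies to the other services in the list, although the exact ICU class and method names should be checked against the ICU documentation.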

Origin and Development

ICU is descended from C++ frameworks produced by Taligent in the mid-1990s. Soon after Taligent became part of IBM in early 1996, Sun Microsystems decided that Java, then in its infancy, "was missing international support. Taligent had great international technology, talented engineers, and a location about 100 meters from Sun's JavaSoft division in Cupertino, California. IBM arranged for Taligent's Text and International group to contribute international classes to Sun's Java Development Kit." [Laura Werner, "Getting Java ready for the world: A brief history of IBM and Sun's internationalization efforts", 1999, http://www.icu-project.org/docs/papers/history_of_java_internationalization.html] Some of the code for text processing, date formatting, etc., was rewritten in Java and became the JDK 1.1 internationalization APIs. A large portion of this code still exists in the java.text and java.util packages. Further internationalization features were added with each later release of Java.
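As a rough illustration of the `java.text` APIs mentioned above, the following sketch splits text at word boundaries and formats a number for two locales. The class name `JavaTextSketch` and its helper methods are hypothetical, and the sketch shows only today's JDK behavior; it makes no claim about how the original Taligent code worked.

```java
import java.text.BreakIterator;
import java.text.NumberFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class built only on JDK java.text classes.
final class JavaTextSketch {

    /** Returns the word-like segments of the text, skipping spaces and punctuation. */
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            // Boundaries also delimit spaces and punctuation; keep real words only.
            if (Character.isLetterOrDigit(piece.codePointAt(0))) {
                out.add(piece);
            }
        }
        return out;
    }

    /** Formats a number with the grouping and decimal symbols of the locale. */
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world!", Locale.ENGLISH));
        System.out.println(formatNumber(1234567.5, Locale.US));
        System.out.println(formatNumber(1234567.5, Locale.GERMANY));
    }
}
```

The two number-formatting calls differ only in the `Locale` argument, which is the central design idea: locale-specific conventions come from data selected at run time, not from the calling code.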

IBM programmers then rewrote the Java internationalization classes in C++ and later ported some classes to C. The C++/C version of ICU is known as ICU4C. The ICU project also provides ICU4J ("ICU for Java"), which adds features not present in the standard Java libraries. ICU4C and ICU4J are kept as similar as possible, though they are not identical; for example, ICU4C includes a character converter API. Both have been enhanced over time to support new facilities and new features of Unicode. ICU was released as an open source project in 1999 under the name "IBM Classes for Unicode" and was later renamed "International Components for Unicode".

See also

* Uniscribe
* OpenType
* Apple Advanced Typography
* Pango
* Graphite (SIL)

References

External links

* ICU website: http://www.icu-project.org/


Wikimedia Foundation. 2010.
