Friday, October 05, 2007

Unicode Makes Multiple Languages Easier In Your eDeveloper V10 Applications

One of the important enhancements to eDeveloper V10 is the addition of full support for Unicode. This FAQ is designed to answer some of the high-level questions you may have regarding Unicode and its use in eDeveloper V10.

Q. What is Unicode?

A. The official definition can be found on the Unicode Consortium website in their glossary:

“Unicode. The universal character encoding, maintained by the Unicode Consortium. This encoding standard provides the basis for processing, storage and interchange of text data in any language in all modern software and information technology protocols.”

Q. What are the benefits of Unicode?

A. Once again we turn to the Unicode Consortium for an answer:

“Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. The Unicode Standard has been adopted by such industry leaders as Apple, HP, IBM, JustSystem, Microsoft, Oracle, SAP, Sun, Sybase, Unisys and many others. Unicode is required by modern standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc., and is the official way to implement ISO/IEC 10646. It is supported in many operating systems, all modern browsers, and many other products. The emergence of the Unicode Standard, and the availability of tools supporting it, are among the most significant recent global software technology trends.”

“Incorporating Unicode into client-server or multi-tiered applications and websites offers significant cost savings over the use of legacy character sets. Unicode enables a single software product or a single website to be targeted across multiple platforms, languages and countries without re-engineering. It allows data to be transported through many different systems without corruption.”

Q. How extensively does eDeveloper V10 support Unicode?

A. eDeveloper V10 now includes across-the-board Unicode support within the product. Support for Unicode is provided in addition to existing support for both the ANSI and OEM standards.

Q. What specific Unicode capabilities are included in eDeveloper V10?

A. Unicode support in eDeveloper V10 lets you:
• Read from and write to Unicode database fields
• Input and output Unicode data to various Input and Output files
• Create program logic for Unicode data
• Perform read and write operations on Unicode files
• Send and receive Unicode data to and from external systems
• Use Unicode definitions in expressions, functions, and form properties

Q. What capabilities are there for Unicode Conversion?

A. eDeveloper V10’s implicit casting mechanism lets you select the code pages you want to use for conversion to and from Unicode. If you don’t select the code pages, eDeveloper V10 will use its own default code page. For purposes of explicit casting, there are two new functions:
• UnicodeFromANSI
• UnicodeToANSI
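
To see why the choice of code page matters, here is a minimal sketch in Python. It is an analogy only: it is not eDeveloper syntax and does not call UnicodeFromANSI or UnicodeToANSI, whose parameters are not described in this FAQ. The helper name to_code_page is made up for illustration.

```python
import unicodedata

# Python analogy only; this is not eDeveloper syntax.
raw = bytes([0xE9])  # a single byte as it might sit in a legacy (non-Unicode) file

# Legacy bytes -> Unicode text. The code page decides which character you get.
as_western = raw.decode("cp1252")   # Windows Western European code page
as_cyrillic = raw.decode("cp1251")  # Windows Cyrillic code page

# Print the official character names (plain ASCII, safe on any console).
print(unicodedata.name(as_western))
print(unicodedata.name(as_cyrillic))


def to_code_page(text, code_page):
    """Hypothetical helper: Unicode text -> legacy bytes.

    Characters that the code page cannot represent become '?'.
    """
    return text.encode(code_page, errors="replace")


# Converting back with the matching code page restores the original byte.
print(to_code_page(as_western, "cp1252") == raw)
# Converting with the wrong code page loses the character.
print(to_code_page(as_cyrillic, "cp1252"))
```

The same byte becomes a Latin letter under one code page and a Cyrillic letter under another, which is why a conversion should always be told which code page the legacy data was written in rather than relying on a default.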

Q. Is it true that Unicode functions have been added to eDeveloper V10?

A. Yes. To complement the Unicode support, two new Unicode-related functions have been added:
• UnicodeCHR - Converts a numeric value to the corresponding Unicode character
• UnicodeVal - Converts a Unicode character to the corresponding numeric value
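
The idea behind this pair of functions, mapping a number to a character and back, can be sketched with Python's built-in chr and ord. This is an analogy under the assumption that the eDeveloper functions work on Unicode code points; it is not eDeveloper syntax, and the helper name describe is made up for illustration.

```python
# Python analogy only; chr/ord stand in for the number <-> character idea.
code_point = 0x20AC            # the number Unicode assigns to the euro sign
character = chr(code_point)    # number -> character
round_trip = ord(character)    # character -> number


def describe(ch):
    """Hypothetical helper: format a character's code point in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


print(round_trip == code_point)
print(describe("A"))
print(describe(character))
print(describe(chr(0x1F600)))  # code points above U+FFFF work the same way
```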

Q. Are there utilities for Unicode Conversion?

A. Absolutely. eDeveloper V10 includes utilities that help you convert data from Unicode to ANSI and from ANSI to Unicode. The utilities let you define the input and output files as well as the code page to use, if you don’t want to use the default code page.
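
For readers who want to picture what such a conversion does, here is a rough sketch in Python. It is not the eDeveloper utility and makes no claim about how that utility is implemented; the function name convert_file, its parameters, and the file names are all assumptions made for illustration.

```python
import os
import tempfile


def convert_file(src, dst, src_encoding, dst_encoding):
    """Hypothetical converter: re-encode a text file from one encoding to another.

    newline="" keeps the original line endings untouched.
    Returns the number of characters converted.
    """
    with open(src, "r", encoding=src_encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding=dst_encoding, newline="") as fout:
        fout.write(text)
    return len(text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        ansi_path = os.path.join(folder, "ansi.txt")
        unicode_path = os.path.join(folder, "unicode.txt")

        # Create a small legacy file in the Western European code page.
        with open(ansi_path, "wb") as f:
            f.write("Caf\u00e9\r\n".encode("cp1252"))

        # ANSI -> Unicode (UTF-16 here), naming the code page explicitly.
        count = convert_file(ansi_path, unicode_path, "cp1252", "utf-16")
        print(count, "characters converted")
```

Passing the input code page explicitly mirrors the option described above: the default is only safe when the data really was written in that code page.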

Q. Are there resources on the web to help us learn more about Unicode?

A. Of course. Start with the Unicode Consortium website.

While I am sure you can google on your own, here are a couple more interesting pages to get you started on a nice long surfing expedition:

Alan Wood’s Page
This is a good resource page with links to many other useful sites.

W3C Page
And of course, W3C could not possibly be silent on a subject as important as Unicode. This is its XML/Unicode page.

Q. What character sets are supported by Unicode?

A. “More than you will ever need” is probably the shortest answer.

Specifically, Unicode 2.0 supports the Arabic, Armenian, Bengali, Bopomofo, Cyrillic, Devanagari (the script employed by Hindi and Sanskrit), Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Latin (including the International Phonetic Alphabet, IPA), Lao, Malayalam, Oriya, Tamil, Telugu, Thai, and Tibetan scripts. These scripts are all written horizontally. Hebrew and Arabic are, of course, written right to left. Indic scripts are written variously and in ways that are sometimes described as a circular motion. Arabic and the Indic scripts must use intelligent ligature selection.
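
Writing direction is recorded in the Unicode character data itself, so software can look it up rather than guess. A small sketch using Python's standard unicodedata module (an illustration only, unrelated to eDeveloper; the helper name direction and the sample characters are chosen here for the example):

```python
import unicodedata

# One sample letter from four of the scripts listed above.
samples = {
    "Latin": "A",
    "Hebrew": "\u05d0",      # HEBREW LETTER ALEF
    "Arabic": "\u0627",      # ARABIC LETTER ALEF
    "Devanagari": "\u0905",  # DEVANAGARI LETTER A
}


def direction(ch):
    """Hypothetical helper: summarize a character's bidirectional class."""
    bidi = unicodedata.bidirectional(ch)
    if bidi in ("R", "AL"):   # right-to-left, and Arabic-letter right-to-left
        return "right-to-left"
    if bidi == "L":
        return "left-to-right"
    return "other"


for script, ch in samples.items():
    print(script, unicodedata.name(ch), direction(ch))
```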

Unicode 3.0 expands some existing scripts and adds Braille, Canadian Aboriginal, Cherokee, Ethiopic, Khmer, Mongolian, Myanmar, Ogham, Runic, Sinhala, Syriac, Thaana, and Yi. Mongolian is the first script that can only be written vertically, in columns.

Besides the characters for writing the world's major languages, there is a whole set of typographic, technical, graphical, mathematical, astrological and other scientific symbols and geometrical shapes in Unicode.

Q. I heard Unicode has limited support for Chinese, Japanese and Korean. Is that true?

A. Unicode supports Han scripts such as those used in Chinese, Japanese and Korean (often called CJK). Unfortunately, some urban legends or myths have developed about the support of CJK characters in Unicode. To quote one published response to these myths: “The Unicode Standard supports all of the CJK characters from JIS X 0208, JIS X 0212, JIS X 0221, or JIS X 0213, for example, and many more. This is true no matter which encoding form of Unicode is used: UTF-8, UTF-16, or UTF-32.”
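
The point about encoding forms is easy to check for yourself. The sketch below, in Python rather than eDeveloper, encodes two Han characters in all three forms and confirms that each one decodes back to the same character. The helper name encoded_forms is made up for illustration.

```python
def encoded_forms(ch):
    """Hypothetical helper: hex bytes of one character in each Unicode encoding form.

    Big-endian variants are used so the byte order is fixed and no BOM is added.
    """
    forms = {}
    for form in ("utf-8", "utf-16-be", "utf-32-be"):
        encoded = ch.encode(form)
        # Every encoding form must round-trip to the same character.
        assert encoded.decode(form) == ch
        forms[form] = encoded.hex()
    return forms


han = "\u6f22"             # a common Han character, U+6F22
rare_han = "\U00020000"    # a Han character outside the 16-bit range, U+20000

print(encoded_forms(han))
print(encoded_forms(rare_han))
```

The byte sequences differ from one form to the next, but the character they represent is identical, including for characters beyond U+FFFF.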

In fact, Unicode supports more than 70,000 CJK characters. More will undoubtedly be added, but you can be assured that the support is comprehensive already and goes above and beyond what is expected. It has also been reported that “the International Standard ISO/IEC 10646 and the Unicode Standard are completely synchronized in repertoire and content. And that means that Unicode has the same repertoire as GB 18030, since that also is synchronized with ISO 10646 — although with a different ordering and byte format.” Rather a mouthful for those who do not follow these things in detail. And in the final analysis, that is the beauty of eDeveloper V10: as with all previous versions of Magic, you do not need to get bogged down in the underlying technical details. With eDeveloper V10, just sit back, relax and enjoy full Unicode support.