Infosys delivers concept-to-market software engineering services across the engineering value chain. Our blog will discuss the latest trends in software product engineering, outsourcing, technologies, and address business challenges.


November 23, 2010

Internationalization and its dimensions in Product space

Enterprise application development is quite different from product development. An enterprise application is generally developed for a targeted set of users, organizations and region(s). The platform on which the application will be deployed is also predefined in most cases, and the number of deployments/installations is limited. That is not the case with a product. A product is targeted at a much wider range of users, organizations, region(s), deployment platforms etc. Also, the number of deployments/installations of a product is much higher than that of an application. Besides, a product in general has to be much more configurable, customizable and scalable. A successful product should be designed to adapt easily to varying environments and markets. Since products today are developed for a wide range of markets, product internationalization is a key consideration for widespread acceptability. An internationalized product should still remain configurable, customizable and scalable.

The following are some of the aspects to consider from a product internationalization perspective:

1. Platform Independence: An internationalized software product should be able to operate in any locale on any operating system (Windows, Linux, Solaris etc.) and should deliver consistent results. The product should also be able to work in a locale that is different from the operating system locale. For instance, a product should be able to work in a Japanese locale on an English Windows OS or an English Linux OS.

2. Multiple Locales on same installation: An internationalized product should be able to support multiple locales in a single installation. Moreover, users should be able to select the locale they prefer, and the product should adapt to the new locale seamlessly at runtime. For example, websites like www.hp.com and www.google.com let the user switch to a different locale (country + language combination) at runtime.

3. Different timezones: An internationalized product should be able to deal with different time zones in a consistent manner. It should be able to store and retrieve times recorded in multiple time zones and display the correct time for the user's selected locale and time zone. For instance, an application might send alerts about an event to users located across the globe. Each user should be able to view the time of the event in his or her own time zone, and the application should handle this conversion reliably. Microsoft Outlook is a live example of this feature: meeting requests, mail etc. display the time to different recipients in their respective time zones.
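One common way to get this consistency is to store every timestamp in UTC and convert only at display time. The following is a minimal sketch of that idea, not a description of how any particular product does it. The user names and the `USER_ZONES` table are hypothetical, and fixed UTC offsets are used only to keep the example self-contained; a real product would normally rely on named time zones with daylight-saving rules.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-user zone table. Fixed offsets are an assumption made to
# keep the sketch self-contained; they ignore daylight-saving changes.
USER_ZONES = {
    "tokyo_user": timezone(timedelta(hours=9), "JST"),
    "bangalore_user": timezone(timedelta(hours=5, minutes=30), "IST"),
    "utc_user": timezone.utc,
}


def store_event_time(local_time):
    """Normalize an aware datetime to UTC before persisting it."""
    if local_time.tzinfo is None:
        raise ValueError("refusing to store a datetime without a time zone")
    return local_time.astimezone(timezone.utc)


def display_event_time(stored_utc, user):
    """Render the stored UTC instant in the given user's zone."""
    local = stored_utc.astimezone(USER_ZONES[user])
    return local.strftime("%Y-%m-%d %H:%M %Z")


# An event created at 18:00 by the Tokyo user.
created = datetime(2010, 11, 23, 18, 0, tzinfo=USER_ZONES["tokyo_user"])
stored = store_event_time(created)
for user in USER_ZONES:
    print(user, "->", display_event_time(stored, user))
```

Every recipient sees the same instant, rendered in a different local time, and the stored value never depends on who created the event.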

4. Help and Documentation: The product should be able to display the Help documentation in the language of the selected user locale.
Let's consider a situation where a product has been released in a new market, with additional locale support. Because time to market was important, the entire help documentation might not have been translated at the time of release. For such cases the product should be designed in a manner that if some documentation or help files are not available in the user locale, the product seamlessly falls back to the default locale. Displaying help pages in the default language is a better option than showing broken links or error pages. Just as with resource bundles, the product should fall back to the default locale for resources like help documentation when a locale-specific resource is missing.
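The fallback described above can be sketched as a lookup that tries the most specific locale first and ends at the default. This is only an illustration under stated assumptions: the `help/<locale>/<topic>.html` layout, the `AVAILABLE` table and the function names are all hypothetical, and English is assumed to be the default locale.

```python
def fallback_chain(locale, default="en"):
    """Return the locales to try, most specific first: ja_JP -> ja -> en."""
    chain = []
    parts = locale.split("_")
    while parts:
        chain.append("_".join(parts))
        parts.pop()
    if default not in chain:
        chain.append(default)
    return chain


def resolve_help_page(topic, locale, available):
    """Pick the best available help page for a topic, or fail loudly."""
    for candidate in fallback_chain(locale):
        if topic in available.get(candidate, set()):
            return f"help/{candidate}/{topic}.html"
    raise LookupError(f"no help page for topic {topic!r} in any locale")


# Hypothetical state at release time: only one topic is translated so far.
AVAILABLE = {
    "en": {"install", "backup"},
    "ja": {"install"},
}

print(resolve_help_page("install", "ja_JP", AVAILABLE))
print(resolve_help_page("backup", "ja_JP", AVAILABLE))
```

The same chain could serve images, audio and video, so that all resource types degrade to the default locale in the same way.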

5. Images/Audio/Video: Some products make use of images, audio clips, or video clips that contain information (visible text or audible speech). For instance, a website may use images to display buttons or links, or use video files that contain text in a specific language (English, for instance). For such a product to be internationalized and operate successfully in a different locale (the Japanese locale, for instance), images with English text should be replaced by images with the same text in Japanese, and videos with English text should be replaced by videos conveying the same message in Japanese. The product should be able to pick the correct image/audio/video clips based on the user locale. Also, as discussed for help files, the product should fall back to the default locale seamlessly if a locale-specific image/audio/video resource is missing.

6. Installers: Deployments of enterprise applications are generally limited, and are done in a planned manner by skilled people. Products, however, are expected to support many deployments across a varied set of environments. Hence, products generally come packaged in an installer, and it is sensible for these installers to be internationalized as well. I18n support for installers again has different dimensions:

a. Internationalized Installer: The installer application should be internationalized like any other software. It should display its user interface in the user's locale and should support core I18n requirements like resource bundles, encoding support, data formatting etc.


b. Ability for Locale specific installations/configurations: Another aspect is that the installer should be able to install the internationalized product with specific localized features. For example, consider a product being developed for the international market. The product may have been well internationalized and built to support a large number of locales. An end user may want to install only the components for the locales he or she is going to work in. The installer should let the user select those locales and customize features for each selected locale. As an example, if the user selects the German and French locales, the installer should extract only the resource bundles/documentation/images corresponding to the German and French languages.
  

7. Bi-directional Support: The product should also support bi-directional languages. Most scripts across the globe are written left to right (LTR), while some are written right to left (RTL), for example those used for Urdu, Arabic and Hebrew. RTL text can also have sections of LTR text (like some English words) or numbers embedded in it. The product should be able to handle such languages. They impact the layout of the UI, the direction of cursor movement, the order of labels and input fields etc. An internationalized product in deployment should be able to handle language bi-directionality without needing any engineering changes.
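As a small illustration of the bi-directional point, a product can choose the base direction of a piece of text from its first strongly directional character, which is the heuristic the Unicode Bidirectional Algorithm uses for paragraphs. The sketch below uses Python's standard `unicodedata` module; the function name `base_direction` is just an illustrative choice, and a real UI would leave the full reordering of mixed text to its rendering toolkit.

```python
import unicodedata


def base_direction(text, default="ltr"):
    """Return 'rtl' or 'ltr' based on the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-like and Arabic-like letters
            return "rtl"
        if bidi_class == "L":          # Latin, kana, kanji and similar
            return "ltr"
    # Only digits, spaces or punctuation: no strong character was found.
    return default


for sample in ("Hello world", "שלום world", "مرحبا 2010", "2010"):
    print(repr(sample), "->", base_direction(sample))
```

The result could drive layout decisions such as text alignment or the order of a label and its input field.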

8. Regional Business Rules & Regulations: Different countries have different rules and regulations. These could cover the storage/retrieval of information, tax laws, financial transactions etc. The product should be able to adapt to such rules based on locale changes. This is complex, as the number of such variations in business rules can be huge. An internationalized product should provide a framework in which new rules can be added, or existing rules changed, for different locales without impacting the design of the product. Depending on the product design, the rules could even be configurable in files, databases etc. Sometimes it becomes necessary to write custom code for a particular locale as part of localization. Such custom code should be easily pluggable into an internationalized product.
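One possible shape for such a framework is a registry in which each locale plugs in its own rule, with a default used when nothing is registered. The sketch below is purely illustrative: the locale codes `xx_AA` and `xx_BB` and the tax rates are made up and do not correspond to any real country or regulation.

```python
from decimal import Decimal

# Registry of locale-specific tax rules. Localization code adds entries here
# without touching the core pricing logic.
_TAX_RULES = {}


def tax_rule(locale):
    """Decorator that registers a function as the tax rule for a locale."""
    def register(func):
        _TAX_RULES[locale] = func
        return func
    return register


@tax_rule("default")
def no_tax(amount):
    return amount


@tax_rule("xx_AA")  # hypothetical locale with a flat 10% tax
def flat_ten_percent(amount):
    return amount * Decimal("1.10")


@tax_rule("xx_BB")  # hypothetical locale with a two-tier tax
def tiered(amount):
    rate = Decimal("0.05") if amount <= 100 else Decimal("0.20")
    return amount * (1 + rate)


def price_with_tax(amount, locale):
    """Core logic: look up the plugged-in rule, fall back to the default."""
    rule = _TAX_RULES.get(locale, _TAX_RULES["default"])
    return rule(amount)


for loc in ("xx_AA", "xx_BB", "xx_CC"):
    print(loc, price_with_tax(Decimal("200"), loc))
```

Supporting a new locale then means registering one more function, or loading the rule from a file or database, while `price_with_tax` stays unchanged.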

9. Encoding Support: Unicode, through its encoding forms UTF-8, UTF-16 and UTF-32, supports almost all the scripts used across the globe. In addition, there are a number of native language encoding schemes that predate Unicode. For instance, Shift JIS and EUC-JP are two different encodings used for Japanese text. Not every operating system supports these native encodings, and that brings implementation challenges. The product should be able to adapt to such encoding changes based on the underlying platform support, without impacting functionality.


10. Ease of adding a new language: The product design should be such that support for any new language can be added without any code/design changes or recompilation.
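One way to meet this goal is to keep translations in data files that the product discovers at startup, so that adding a language means adding a file. The sketch below assumes a hypothetical `messages_<locale>.json` naming convention and uses a temporary directory to stand in for the product's resource folder.

```python
import json
import tempfile
from pathlib import Path


def load_catalogs(directory):
    """Read every messages_<locale>.json file found in the directory."""
    catalogs = {}
    for path in sorted(Path(directory).glob("messages_*.json")):
        locale = path.stem[len("messages_"):]
        catalogs[locale] = json.loads(path.read_text(encoding="utf-8"))
    return catalogs


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "messages_en.json").write_text(
        json.dumps({"greeting": "Hello"}), encoding="utf-8")
    before = load_catalogs(root)

    # Adding German later is a file drop, not a code change.
    (root / "messages_de.json").write_text(
        json.dumps({"greeting": "Hallo"}), encoding="utf-8")
    after = load_catalogs(root)

print(sorted(before))
print(sorted(after))
```

The loader never names a specific language, which is what keeps new locales out of the build.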

Many of us consider 'Internationalization' to be as simple as displaying text in different languages - but as the points above indicate, there are several dimensions that need to be evaluated before starting the I18n work for any product. Not all the dimensions discussed above are mandatory for every software application/product. The applicability of each dimension depends upon the type of application, its usage, its deployment landscape and the requirements of the target market. Businesses should select the features to implement based on the cost of implementation and the ROI that can be achieved in the target markets.

 

November 14, 2010

Software service in the Japanese market

The Japanese market remains an important target in the localization schedules of internationalized products. Despite China replacing Japan as the world's second largest economy, requests to treat 'Japanese localization' as a priority in internationalized software products show no sign of an immediate decline. Understanding Japanese encoding schemes requires a great deal of effort - especially for someone who has been spoilt by the simple elegance of ASCII encoded text. Though Unicode is THE way ahead, one must understand that there are still thousands of legacy products out in the domestic market - a complete rewrite of which is not a viable business option in the current economic climate in Japan. As more and more important Japanese businesses outsource legacy software maintenance/enhancements to service providers, quality handling of such software will require a decent understanding of Japanese text representation and encoding schemes.

The Japanese language has three main writing systems - hiragana, katakana and kanji. Kanji are logographic characters borrowed from Chinese, while hiragana and katakana are phonetic characters, each representing a syllable. Hiragana is used for words of Japanese origin and for grammatical elements (you could write "nihon", which means "Japan", in hiragana, although it is normally written in kanji). Katakana is used for words of foreign origin (you would use katakana to write "tabako", which means cigarette and is derived from the word "tobacco"). Kanji are very commonly used in Japan. Thus, the Japanese character set would ideally mean all of the hiragana, katakana and kanji characters used in the Japanese writing system.
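For readers who do not read Japanese, a quick way to see which writing system a character belongs to is to look at its Unicode character name. The helper below is a rough sketch using Python's standard `unicodedata` module. The name `script_of` is illustrative, and the classification is deliberately simplified: half-width katakana and rarer kanji outside the main CJK block fall into "other".

```python
import unicodedata


def script_of(ch):
    """Classify a single character by the prefix of its Unicode name."""
    name = unicodedata.name(ch, "")
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"
    return "other"


# "nihon" in hiragana, "tabako" in katakana, "nihon" in kanji.
for word in ("にほん", "タバコ", "日本"):
    print(word, [script_of(ch) for ch in word])
```

A check like this can help a tester confirm that sample input really exercises all three writing systems.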

English text (a character set containing letters, punctuation etc.) in computers, communications equipment and other devices has traditionally been represented by the ASCII character-encoding scheme. Similarly, before the advent of Unicode, Japanese text on computerized interfaces was represented in the JIS (Japanese Industrial Standard) character sets, as defined by the Japanese Standards Association.
Interestingly, the JIS character set is actually a combination of several standard character sets - JIS X 0201 (Roman characters and half-width katakana), JIS X 0208 (full-width katakana, hiragana, punctuation and a number of kanji characters), JIS X 0212 (rare kanji, non-English European characters etc.) and JIS X 0213 (a newer, extended character set introduced in 2000).

There are essentially three encoding schemes used to represent the JIS character set - Shift JIS, EUC-JP, and ISO-2022-JP. When working on software for Japanese localized systems, one should be most concerned with the Shift JIS and EUC-JP encoding schemes. Shift JIS is the common encoding of JIS on Windows platforms, while EUC-JP is the common encoding on UNIX systems. However, you will find support for Shift JIS on UNIX too (PCK on Solaris, for example). CP932 is Microsoft's extension of Shift JIS that includes some NEC special characters and IBM extensions.
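To make the difference between these schemes concrete, the sketch below encodes the same Japanese text with each of them and prints the resulting bytes. It relies on the `shift_jis`, `euc_jp` and `iso2022_jp` codecs that ship with Python's standard library; the helper name `encode_all` is just illustrative, and UTF-8 is included only for comparison.

```python
def encode_all(text, codecs=("shift_jis", "euc_jp", "iso2022_jp", "utf-8")):
    """Encode the same text with several codecs and return the raw bytes."""
    return {codec: text.encode(codec) for codec in codecs}


# "nihongo" (the Japanese language), written with three kanji.
for codec, data in encode_all("日本語").items():
    print(f"{codec:11} {len(data):2d} bytes  {data.hex(' ')}")
    # Each encoding must decode back to the identical text.
    assert data.decode(codec) == "日本語"

# A half-width katakana character from JIS X 0201 takes one byte in
# Shift JIS but two bytes in EUC-JP.
halfwidth = encode_all("ｱ", codecs=("shift_jis", "euc_jp"))
print({codec: len(data) for codec, data in halfwidth.items()})
```

The characters are the same in every case. Only the byte sequences differ, which is why a file must always be read with the encoding it was written in.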

The rules for parsing Japanese text vary with the encoding scheme being used. As almost all Japanese characters are represented using multiple bytes, the rules for determining what is a single-byte character, what is a lead byte and what is a trailing byte differ based on the encoding of the text. Rather than hand-coding these byte-level rules for each encoding, a more viable option in today's Unicode world is to convert received text to a defined Unicode encoding, use widely available Unicode libraries to perform the necessary string processing, and then convert back to the original native encoding for display/third party interfacing.
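The sketch below shows why byte-level processing of Shift JIS text is risky and how the decode, process, re-encode approach avoids the problem. In Shift JIS the kanji 表 is encoded with a trailing byte of 0x5C, which is also the ASCII code for a backslash. The function names are illustrative only.

```python
def has_backslash_bytewise(data):
    """Naive check that treats every byte as a separate character."""
    return b"\\" in data


def has_backslash_decoded(data, encoding="shift_jis"):
    """Decode to Unicode first, then search in terms of characters."""
    return "\\" in data.decode(encoding)


def truncate_chars(data, max_chars, encoding="shift_jis"):
    """Decode, cut on a character boundary, then re-encode for output."""
    return data.decode(encoding)[:max_chars].encode(encoding)


raw = "表示".encode("shift_jis")  # "hyouji" (display): two kanji, four bytes

print(has_backslash_bytewise(raw))  # the trailing byte of 表 looks like '\'
print(has_backslash_decoded(raw))   # no backslash character is present

# Cutting at an arbitrary byte offset can split a character in half.
try:
    raw[:3].decode("shift_jis")
except UnicodeDecodeError as exc:
    print("byte-level truncation broke the text:", exc.reason)

print(truncate_chars(raw, 1).decode("shift_jis"))
```

Path handling, escaping and string truncation are typical places where this kind of false match shows up in legacy code.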

If you work on maintaining legacy software originating from Japanese companies, you will need to know about the various encoding schemes that are used to represent Japanese text. A decent knowledge of a few Japanese words, and of how to enter them as hiragana, katakana and kanji, can help in performing some basic tests to validate the correctness of source code changes in the application. In addition, the ability to generate text in a defined encoding (using editors like Hidemaru, for example) is extremely helpful in testing software for the Japanese market.

 
