Infosys delivers concept-to-market software engineering services across the engineering value chain. Our blog will discuss the latest trends in software product engineering, outsourcing, technologies, and address business challenges.


November 25, 2009

Internationalization and the development life cycle

As a product company, your team has come up with a brilliant concept that has tremendous market potential in your country. Your market survey shows that the concept will soon catch on in other countries, so you can capture overseas markets too. The only catch is that the product must be globalized before it is launched internationally. A global launch is still 5-6 months away, so which product strategy will you adopt? Do you develop the product in English and think about internationalizing it later, once you have access to the global markets? Or do you start developing an internationalized version right from the conceptualization stage, so that you are ready to enter the global markets when the time comes? This is a question most product managers will face while developing a product.

The ideal time to start thinking about internationalization or localization of your product is the conceptual stage of the product development life cycle. Product management has to be clear about its vision for the product. If the product is meant for international audiences, it is a good idea to plan for that sooner rather than later: retrofitting is typically more expensive, and a delayed launch may also cost you market share. During development, internationalization usually matters more than localization, because localization normally does not involve re-engineering code, whenever it is done. Think about a product that is not internationalized and to which you now want to add multilingual support. Changes will have to be made in multiple layers, and they will be costly in terms of time taken, bugs introduced and additional testing required. If the product architecture had made the same provision at design time, things would have been much simpler: all the development team would have to do to support a new language is get the string resources translated.
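
To make the point about string resources concrete, here is a minimal sketch of what "provision at design time" can look like. The resource table, the keys and the `translate` helper are all hypothetical names chosen for illustration; a real product would more likely load per-locale resource files through its platform's localization library.

```python
# Hypothetical resource tables. In a real product these would live in
# per-locale resource files that are handed to translators.
RESOURCES = {
    "en": {"greeting": "Welcome, {name}!", "save": "Save"},
    "de": {"greeting": "Willkommen, {name}!", "save": "Speichern"},
}
DEFAULT_LOCALE = "en"


def translate(key, locale, **params):
    """Look up a UI string by key, falling back to the default locale.

    A key that is missing from the default locale raises KeyError, which
    surfaces untranslated strings early instead of hiding them.
    """
    table = RESOURCES.get(locale, {})
    template = table.get(key, RESOURCES[DEFAULT_LOCALE][key])
    return template.format(**params)
```

Because the code only ever refers to keys such as "greeting", supporting another language in this sketch means adding one more table; no calling code changes.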

What does it take to internationalize your product all the way from requirements through design and development to testing? The requirements gathering team must understand the typical i18n requirements and evaluate the product requirements from an i18n perspective too. The design team must understand the typical i18n aspects and ensure that the architecture and design address them. The development team should be experienced in making i18n-related code changes and must understand i18n best practices in order to avoid rework later. Development of the core components and internationalization must go hand in hand. Pseudo-localization testing, in which UI strings are replaced with altered stand-ins rather than real translations, must be planned for the product to identify potential localization issues. If these practices are followed, it should be easy to adapt the product to different regions or countries as and when required, with minimum cost and time to market.
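
As an illustration of pseudo-localization, the sketch below turns an English UI string into an accented, padded and bracketed variant. The function name, the accent mapping and the 30% expansion factor are assumptions made for this example, not part of any particular tool.

```python
# Map plain vowels to accented ones so that hard-coded (untransformed)
# strings and encoding problems stand out on screen.
ACCENTED = str.maketrans("aeiouAEIOU", "àéîöûÀÉÎÖÛ")


def pseudo_localize(text, expansion=0.3):
    """Return an accented, padded, bracketed variant of an English string.

    The padding imitates the extra length translated text often needs, and
    the brackets make truncated strings easy to spot. This sketch does not
    protect format placeholders such as {name}; a real tool would have to.
    """
    padding = "~" * max(1, round(len(text) * expansion))
    return "[" + text.translate(ACCENTED) + padding + "]"
```

Running the UI with every resource string passed through such a function shows, before any translation is paid for, which strings were never externalized and which layouts cannot cope with longer text.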

November 24, 2009

Trading off with Design Patterns

Over the last decade or so, anyone asking how to design object-oriented software systems has been advised to read the Gang of Four (Go4) book on design patterns (Design Patterns: Elements of Reusable Object-Oriented Software). It is without doubt a wonderfully written book and should be in the possession of most software designers working in object-oriented design. But what happens when an over-enthusiastic reader ends up seeing patterns in every software problem he encounters?

I recently came across a hilarious but true write-up on software designers and architects going overboard (http://www.joelonsoftware.com/articles/fog0000000018.html), trying hard to resolve a future problem that may or may not exist and thereby compromising on what needs to be resolved NOW. I have been a victim of the same phenomenon: the obsession with "what if tomorrow ...". This is not a criticism of the need to look ahead; it is just to drive home the point that one must not lose focus on what needs to be solved right now.

The same applies to the choice of design patterns. Over-enthusiastic designers end up creating a problem in order to implement a design pattern they find savvy, often as a result of the same "what if tomorrow ..." phenomenon. One manifestation of this was a case where a product was redesigned to enhance maintainability. The end result was beautifully maintainable code but seemingly not performant enough to be launched in the market. The designers had looked too far ahead, at the problems of tomorrow, to think about the product's current deployment requirements. It is thus extremely important to trade off what a customer wants today as an immediate market requirement against what he is willing to compromise on at a future time.

One needs to be extremely careful when trying to fit a design solution or pattern to a problem. The Singleton pattern is very commonly used and admittedly one of the easiest to understand. You will not have to look too hard at the problem you are trying to solve to find reasons to implement a Singleton, and a lot of enthusiastic new designers will jump at the chance to implement that chapter of the Go4 book. But you have got to be sure that there is no thread-safety requirement that the customer has passed on without anyone realizing, since a naively implemented, lazily created Singleton is not thread-safe and might end up making the 'Singleton' choice look amateurish.
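
To show what a thread-safety requirement does to even this simple pattern, here is a minimal sketch of a lazily created Singleton whose creation is guarded by a lock. The `Configuration` class and its `settings` attribute are hypothetical, and this is one common way of doing it rather than the only one.

```python
import threading


class Configuration:
    """Hypothetical singleton whose lazy creation is guarded by a lock."""

    _instance = None
    _lock = threading.Lock()

    def __init__(self):
        self.settings = {}

    @classmethod
    def instance(cls):
        # Double-checked locking: the lock is only taken while the
        # instance is still missing, so later calls stay cheap.
        if cls._instance is None:
            with cls._lock:
                # Re-check inside the lock: another thread may have
                # created the instance while this one was waiting.
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance
```

Note that the lock only protects creation. If several threads also modify `settings`, that access needs its own protection, which is exactly the kind of requirement that is easy to miss when the pattern is chosen first and the problem examined second.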

Bruce Powel Douglass spoke at a session at the Infosys campus in Bangalore in July this year, where he stressed that architects and designers need to analyse thoroughly the choice of design patterns and the best fit for the system being built. He called it 'the selection of patterns using design trade-off analysis' and explained how the choice of design patterns needs to be weighed against the design criteria that the product is expected to meet. For example, it is important to weigh typical design criteria such as the following against the needs of the product.

1. Worst case performance
2. Time to market
3. Memory
4. Reusability
5. Simplicity
6. Safety and Reliability

Now, given a choice of multiple design patterns, each pattern needs to be rated on how much it would help achieve each of the above design criteria; the ratings then provide some direction on the final choice. It is a nice elucidation of a structured approach to making the right choices in design.
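
One simple way to carry out such a rating is a weighted score per candidate pattern. The sketch below is only an illustration of the idea: the criteria come from the list above, but the weights, the rating scale and the function names are assumptions of this example and are not taken from the talk.

```python
# Hypothetical weights: how much each criterion matters for THIS product.
CRITERIA_WEIGHTS = {
    "worst_case_performance": 5,
    "time_to_market": 3,
    "memory": 2,
    "reusability": 2,
    "simplicity": 3,
    "safety_and_reliability": 4,
}


def weighted_score(ratings, weights=CRITERIA_WEIGHTS):
    """Sum of weight * rating over all criteria for one candidate pattern."""
    return sum(weights[criterion] * ratings[criterion] for criterion in weights)


def rank_patterns(candidates, weights=CRITERIA_WEIGHTS):
    """Return candidate names ordered from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )
```

Here `candidates` maps each pattern name to its ratings per criterion. The numbers do not make the decision, but writing them down forces the team to say which criteria matter most today.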

Though Bruce spoke with reference to the world of embedded systems, this approach holds true no matter what system you intend to design. In today's world of instant solutions, doing this might seem slightly unwieldy, but it is no doubt worth the effort.

So, go ahead and master the design patterns espoused in the many books available, but remember to use them only as tools to solve a problem that exists. Use them to solve a potential future problem only when you are sure that doing so will not compromise the current solution.

November 11, 2009

Effort estimation for a Globalization project

Effort estimation is the first step in undertaking any software project, and a Globalization project is no different. Effort estimation for a product or application that needs to be globalized follows more or less the same principles as for regular maintenance projects, yet there are no defined methods specifically for estimating the amount of I18N or L10N change required. While working on the proposal for a Globalization project for one of our clients, we faced a dilemma: adopt standard methodologies such as SMC-based or FP-based estimation, or create a hybrid, our own estimation model that follows the same principles but is tailored to globalization projects. In the end we came up with a raw estimation model that was fine-tuned over time and gave us estimates statistically in line with the results from other maintenance projects.

The first step in estimation is to understand the underlying product. Embarking on a project without complete information generally leads to disaster later. In the initial meetings with the client it is important to understand the current scope of the product. It is useful to know the target geographies where the product is going to be sold, the current degree of internationalization (if any), the platforms that need to be supported, the product architecture and so on. Each requirement adds to the challenge of estimation. The technical people involved in the estimation should have prior Globalization experience and understand the various I18N impact points in the code. They should be able to isolate the code that needs I18N-related changes from the rest of the code. Of course, this is a daunting task when the code base is huge, which is typical of a product, so we need tools and utilities that can find the impact points in the code. Static analysis tools are available that can do this to a certain degree. They can report the number of hard-coded strings in the product, the number of non-Unicode APIs and data types used and so on, and these reports can be analysed further and used during estimation. At Infosys we use an Internationalization tool developed in-house; it is rule-based and analyses code against the specific set of rules that we define. This way the reports contain highly relevant information that can be used directly in the estimation model.
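
The following is a deliberately crude sketch of what a rule-based scan for impact points might look like. It is not the Infosys tool, whose rules are not described here; the rule names and regular expressions are assumptions for illustration, and a real static analyser would parse the code properly instead of pattern-matching text (this sketch will, for instance, also count strings inside comments).

```python
import re

# Hypothetical rules, each a name plus a pattern to count in C/C++ source.
RULES = {
    # Double-quoted literals, allowing escaped characters inside.
    "hard_coded_string": re.compile(r'"(?:\\.|[^"\\\n])*"'),
    # A few narrow-character C string functions.
    "narrow_string_api": re.compile(r'\b(?:strlen|strcpy|strcat|sprintf)\s*\('),
    # Narrow-character data types.
    "narrow_data_type": re.compile(r'\b(?:char|std::string)\b'),
}


def scan_source(source):
    """Count the matches of every rule in one source file's text."""
    return {name: len(pattern.findall(source)) for name, pattern in RULES.items()}
```

Summing these counts over all files gives the kind of raw numbers (strings to externalize, APIs and types to replace) that an estimation model can take as input.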

At the time of estimation, it is important for the architect to decide on the encoding the product will support. This decision has a direct bearing on the impact points in the code. If the application has to support UTF-16, most of the string-handling APIs and data types in a C++ application have to be replaced with their wide-character equivalents, whereas if it has to support UTF-8, only a handful of string-related APIs are affected. The choice of encoding is very important, since switching to a different encoding later, at the implementation stage, can be very expensive and introduces risks to the quality and schedule of the project. Every encoding has its pros and cons, and the choice must be well debated before a decision is made. If the product uses a database, the database layer should be analysed so that data flowing in and out of the database is in the required encoding. All internal and external interfaces of the application must be analysed so that the data flowing between modules or applications has an encoding the communication layer can understand. The tools that help in estimation have limited scope; the rest depends on the expertise of the person analysing the code and design documents.
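
A small experiment shows why the two encodings touch the code so differently. The helper below is a hypothetical name introduced for this example; it simply reports how many bytes the same text occupies in each encoding.

```python
def encoded_sizes(text):
    """Report the size of the same text in code points, UTF-8 and UTF-16."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # "utf-16-le" avoids the 2-byte byte-order mark that "utf-16" adds.
        "utf16_bytes": len(text.encode("utf-16-le")),
    }
```

For plain ASCII text such as "abc", UTF-8 uses one byte per character, exactly as before, which is why much existing byte-oriented code keeps working and only the string APIs that count or split characters need attention. In UTF-16 every character takes at least two bytes, so buffers, data types and API calls throughout the code are affected.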

The software estimation process breaks the requirements down into sub-requirements that are made as granular as possible. At a very granular level, if we know the number of APIs or data types we need to change, we can roughly estimate the effort required to make those changes. If we know the third-party tools the application interfaces with, we can estimate the effort required to internationalize the external interfaces or upgrade the third-party tools to versions that support Unicode. A simple requirement like Unicode support for the UI translates to creating resource files for all locales, counting the strings that need to be externalized into those resource files, creating a library for reading and writing the resource files and so on. In this way we estimate at the most granular level, always taking into account our past experience with similar changes and the organization-wide PCB (Process Capability Baseline) metrics. This estimation model is bottom-up: estimates at the lowest level add up to give the total development effort. To this we add the usual project management and testing effort to arrive at a final estimate.
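
The arithmetic of a bottom-up estimate can be sketched in a few lines. Everything numeric here is hypothetical: the per-unit effort figures would in practice come from past projects and baseline metrics, and the single `overhead_rate` is a simplification standing in for project management and testing effort.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    """One measurable I18N impact area, e.g. strings to externalize."""
    name: str
    count: int             # number of units found by analysis
    hours_per_unit: float  # hypothetical figure from past experience


def bottom_up_estimate(items, overhead_rate=0.25):
    """Add up per-item effort, then apply an overhead for PM and testing."""
    development = sum(item.count * item.hours_per_unit for item in items)
    overhead = development * overhead_rate
    return {
        "development": development,
        "overhead": overhead,
        "total": development + overhead,
    }
```

The value of the model lies less in the formula than in the list of work items: the more exhaustive and measurable that list is, the less the total depends on guesswork.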

The key to the whole estimation process is understanding the product, coming up with an exhaustive list of I18N impact areas and breaking them down into measurable entities that can be analysed manually or with tools. Like any other estimation process, this may or may not be very accurate at first, but as it is applied to several Globalization projects the model becomes better defined and the estimates become more accurate. I am sure there are other estimation models people have experimented with for Globalization projects. It would be interesting to discuss alternative models and understand the pros and cons of each.

November 10, 2009

Embrace Parallelism with Virtual Machines

Parallelism has until recently been a term associated with the world of high performance computing. Though humans have been endowed naturally with the ability to 'parallelize' worldly activities (one dangerous manifestation of which is the tendency to talk on the cell while driving your car), designing systems to embrace parallelism has always required that extra bit of mental effort.

Software architects and designers have been spoilt by the steady increase in processor speeds over the years, thanks to Moore's Law. But that is changing at a very fast pace. Today's hardware designs (multicore processors) require software to be designed with parallelism built in if its performance is to scale on next-generation hardware (with its promise of multiplying cores). Herb Sutter has explained the reasons for this change in software design strategy very well in his famous article "The Free Lunch Is Over", published in 2005.

Let us consider a software system running on a server with modern multicore hardware. The system is considered to scale well if it can spawn enough threads to keep all the available cores busy and hence speed up execution (for now, let's ignore specifics like the overhead of thread management as against the benefit of parallelized execution). Is there a limit to the ability to scale here? Yes: the number of cores available on the server. Ideally, you would want to make more and more cores available dynamically to consume the tasks that are parallelizable but waiting for a core to be freed, but you are limited by the number of cores the server has.
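
The ceiling described above can be put in numbers with a small sketch. The function names are hypothetical, and the model assumes equally sized, fully independent tasks and ignores thread-management overhead, just as the paragraph does.

```python
import math
import os


def usable_workers(pending_tasks, cores=None):
    """Number of tasks that can truly run at once on one server."""
    cores = cores or os.cpu_count() or 1
    return min(pending_tasks, cores)


def rounds_needed(pending_tasks, cores=None):
    """Rounds of execution needed when only `cores` tasks run at a time."""
    cores = cores or os.cpu_count() or 1
    return math.ceil(pending_tasks / cores)
```

With 100 pending tasks on an 8-core server, only 8 run at any moment and the rest wait; spawning more threads does not change that.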

Moving the above problem to a wider canvas (a cloud computing environment, for example), software that can scale across multiple servers exhibits parallelism in its own way. In this context, the processing element is a 'server' rather than a 'core'. Of course, making more and more server machines available to feed the software's hunger for parallelism is easier said than done (considering the high costs of server procurement, server management overhead, power management etc.) - unless you have considered virtualization.

Virtualization-based solutions to such problems distribute incoming requests across a number of virtual machines instead of physical servers. Such solutions can create additional processing elements (virtual machines) as the workload increases, so that all available parallelizable tasks are catered to. Conversely, when the workload drops, unused virtual machines can be made dormant, saving on power costs. Notice that the ability to scale is no longer bounded by the processor configuration of a single server (as in the standalone software system illustrated above). Considerations along these lines are probably important in architecting cloud computing solutions.
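
The scale-up, scale-down and distribution behaviour described above can be sketched as two small functions. This is a toy model under stated assumptions: the capacity per virtual machine, the minimum and maximum pool sizes and the round-robin policy are all hypothetical choices for this example, and no real virtualization or cloud API is involved.

```python
import math


def desired_vm_count(pending_requests, requests_per_vm, min_vms=1, max_vms=20):
    """Number of virtual machines to keep active for the current workload."""
    needed = math.ceil(pending_requests / requests_per_vm)
    # Never drop below the minimum pool and never exceed the allowed maximum.
    return max(min_vms, min(max_vms, needed))


def distribute(requests, vm_count):
    """Assign requests to virtual machine indices in round-robin order."""
    assignment = {vm: [] for vm in range(vm_count)}
    for index, request in enumerate(requests):
        assignment[index % vm_count].append(request)
    return assignment
```

When the workload falls, `desired_vm_count` returns a smaller number and the surplus machines can be made dormant; when it rises, new ones are brought up until the configured maximum is reached.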

The concept of a 'processing element' in parallel programming patterns, which usually meant a processing core, can now be extended to include virtual machines too. With that, it becomes easier to relate common parallel design patterns like Task Decomposition and Data Decomposition to solving large problems using virtual machines. Embracing parallelism with virtual machines is a reality today.


 
