Effort estimation for a Globalization project
Effort estimation is the first step in undertaking any software project, and a Globalization project is no different. Estimating the effort for a product or application that needs to be Globalized follows more or less the same principles as a regular maintenance project, yet there are no defined methods specifically for estimating the amount of I18N or L10N change required. While working on the proposal for a Globalization project for one of our clients, we faced a dilemma: should we adopt a standard methodology such as SMC-based or FP-based estimation, or create a hybrid, an estimation model of our own that follows the same principles but is better tailored for Globalization projects? We finally settled on a raw estimation model which was fine-tuned over time and gave us estimates that were statistically in line with the results from other maintenance projects.
The first step in estimation is to understand the underlying product. Embarking on a project without complete information generally leads to disaster later. In the initial meetings with the client it is important to understand the current scope of the product: the target geographies where the product will be sold, the current degree of internationalization if any, the platforms which need to be supported, the product architecture and so on. Each requirement throws up more challenges for estimation. The technical people involved in the estimation should have prior Globalization experience and understand the various I18N impact points in the code. They should be able to isolate the code which needs I18N-related changes from the rest of the code. Of course this is a daunting task when the code base is huge, which is the typical scenario with a product, so we need tools and utilities which can find all the impact points in the code. There are static analysis tools available which can do this to a certain degree. They can help in finding the number of hard-coded strings in the product, the number of non-Unicode APIs and data types used, and so on, and produce reports which can be further analysed and used during estimation. At Infosys we use our in-house Internationalization tool, which is rule based and analyses code against the specific set of rules that we configure. This way the reports contain very relevant information which can be used directly in the estimation model.
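As a toy illustration of what such a rule-based scan might look like, the sketch below counts occurrences of non-Unicode APIs in a line of source code. The rule list and the naive substring matching are assumptions for illustration only; a real analysis tool would parse the code and apply a much richer rule set.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical rule set: ANSI string APIs that have Unicode equivalents.
static const std::vector<std::string> kAnsiApis = {
    "strlen", "strcpy", "sprintf", "fopen"
};

// Count how many impact points (rule matches) appear in one line of code.
int countImpactPoints(const std::string& line) {
    int hits = 0;
    for (const std::string& api : kAnsiApis) {
        for (std::size_t pos = line.find(api); pos != std::string::npos;
             pos = line.find(api, pos + api.size())) {
            ++hits;
        }
    }
    return hits;
}
```

Totals from a scan like this, aggregated per file and per rule, are the kind of raw counts that feed directly into the estimation model.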
At the time of estimation, it is important for the architect to decide on the encoding which will be supported by the product. This decision has a direct bearing on the impact points in the code. If the application has to support UTF-16, most of the APIs and data types in a C++ application have to be replaced with their wide-character equivalents, while if the application has to support UTF-8, only a handful of string-related APIs are impacted. The choice of encoding is critical: switching to a different encoding later, at the implementation stage, can prove very expensive and introduces risks to the quality and schedule of the project. Every encoding has its pros and cons, and these must be well debated before going ahead with the decision. If the product includes database support, the database layer should be analysed so that data flowing in and out of the database is in the required encoding. All internal and external interfaces of the application must be analysed so that the data flowing between modules or applications is in an encoding which the communication layer can understand. The tools which help in estimation have a limited scope; the rest depends on the expertise of the person analysing the code and design documents. The software estimation process breaks the requirements down into sub-requirements which are made as granular as possible. At a very granular level, if we know the number of APIs or data types we need to change, we can roughly estimate the effort required to make those changes. If we know the third-party tools the application interfaces with, we can estimate the effort required to internationalize the external interfaces or to upgrade the third-party tools to their Unicode-supporting versions.
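To make the UTF-16 impact concrete, here is a minimal before/after sketch of the kind of replacement involved: `char` becomes `wchar_t`, string literals gain an `L` prefix, and `str*` APIs move to their `wcs*` equivalents. The function names are illustrative, not from any particular codebase.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Before internationalization: narrow (single-byte) strings and str* APIs.
std::size_t narrowLength(const char* s) {
    return std::strlen(s);
}

// After moving to UTF-16: wchar_t buffers, L"..." literals and wcs* APIs.
std::size_t wideLength(const wchar_t* s) {
    return std::wcslen(s);
}
```

Every call site of this kind is an impact point, which is why the count of affected APIs and data types maps so directly onto effort.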
A simple requirement like Unicode support for the UI translates into creating resource files for all locales, counting the strings which need to be externalized into those resource files, creating a library for reading and writing the resource files, and so on. In this way we estimate at very granular levels, always taking into account our past experience with similar changes and the organization-wide PCB (Process Capability Baseline) metrics. This estimation model is based on the bottom-up approach, where estimates at the very root level finally add up to give the total development effort. To this we add the usual project management and testing efforts and arrive at a final estimate.
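The bottom-up roll-up can be sketched as follows. The impact categories, per-item effort figures and overhead factor are illustrative assumptions, not PCB metrics; in practice each figure would come from past project data.

```cpp
#include <vector>

// One granular impact category, e.g. "hard-coded strings to externalize".
struct ImpactItem {
    int count;            // occurrences found by the analysis tools
    double hoursPerItem;  // assumed effort per change (illustrative)
};

// Sum development effort over all granular items, then apply a single
// factor covering project management and testing effort on top.
double estimateTotalHours(const std::vector<ImpactItem>& items,
                          double overheadFactor) {
    double devHours = 0.0;
    for (const ImpactItem& item : items) {
        devHours += item.count * item.hoursPerItem;
    }
    return devHours * (1.0 + overheadFactor);
}
```

For example, 100 hard-coded strings at 0.25 hours each plus 40 API replacements at 0.5 hours each gives 45 development hours; a 50% overhead factor brings the estimate to 67.5 hours.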
The key to the whole estimation process is understanding the product, coming up with an exhaustive list of I18N impact areas, and breaking them down into measurable entities which can be analysed manually or with tools. Like any other estimation process, this may not be very accurate at first, but after applying it to several Globalization projects the model becomes better defined and the estimates much more accurate. I am sure there are other estimation models people have experimented with while estimating effort for Globalization projects. It would be interesting to discuss alternate models and understand the pros and cons of each.