The Infosys Labs research blog tracks trends in technology with a focus on applied research in Information and Communication Technology (ICT)


Monitoring Load Generators during Performance Testing

Is monitoring of load generators required during performance testing? This question is generally answered with a yes, but in practice it is rarely done. The significance of understanding the load generation process is often overlooked in performance testing. This blog discusses a case study where the load generators were not able to generate the load as expected, and how the issue was resolved.

We were working on a Proof of Concept (PoC) to study the performance of a sample application under a sustained peak load. The load tests were executed for a single business transaction at multiple load levels starting from 100 users. Every test included a ramp-up so that the application could absorb the gradual increase in load. The load scenarios used a minimal think time of one second and ran for a short duration of 15 minutes, as the objective was to find the peak load that drives CPU utilization to 80%.

The application handled up to 300 users, but a large number of errors were thrown when the load reached about 330 users during the ramp-up of the 500-user test. The test results log showed HTTP 500 Internal Server Errors along with exceptions such as java.net.BindException, java.net.ConnectException and java.net.SocketException.

Analysis of the server logs showed that the HTTP 500 errors were caused by wrong values being passed for some of the request parameters. The script was designed to capture these dynamic values from the response of the previous request and use them in the subsequent request. It turned out that the first request itself had failed, so the second request was never updated with the parameter values; this resulted in a java.lang.NumberFormatException on the server, and HTTP 500 errors were returned for the second request. Since the real cause of the failures was the requests that failed first, the next step was to find out why those requests failed. Their error responses pointed to socket exceptions, yet none of those socket exceptions appeared in the server logs.

To dig deeper, the load generator itself was monitored to see how socket connections were being established from the testing tool to the server, using the Microsoft Windows netstat command. The netstat output showed far more socket connections to the server than the number of users being simulated by the testing tool, and most of these sockets were in the TIME_WAIT state.
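This check was done manually with netstat, but as a rough illustration (not part of the original setup), a small utility along the lines of the sketch below can summarize the TCP socket states on the load generator by parsing the output of netstat -an, which makes a build-up of TIME_WAIT sockets easy to spot:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.Map;
import java.util.TreeMap;

// Summarizes TCP socket states (ESTABLISHED, TIME_WAIT, ...) on the load
// generator by parsing the output of "netstat -an".
public class SocketStateCounter {
    public static void main(String[] args) throws Exception {
        Process netstat = new ProcessBuilder("netstat", "-an").start();
        Map<String, Integer> states = new TreeMap<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(netstat.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] cols = line.trim().split("\\s+");
                // Windows netstat lines look like: TCP  <local addr>  <remote addr>  <state>
                if (cols.length >= 4 && cols[0].equalsIgnoreCase("TCP")) {
                    states.merge(cols[3], 1, Integer::sum);
                }
            }
        }
        states.forEach((state, count) -> System.out.println(state + " : " + count));
    }
}

Running something like this (or simply netstat -an | find /c "TIME_WAIT" on Windows) at intervals during the ramp-up shows whether the number of sockets grows in line with the simulated users or runs away from it.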

Having established that the issue lay in how connections were being set up from the load generator to the server, the next step was to investigate how the HTTP calls were being made. This led to an analysis of the testing tool plug-in used to simulate HTTP requests, which revealed that the default plug-in was the cause: it did not reuse HTTP connections and opened a new one for every request. That also explained why so many sockets were in the TIME_WAIT state. Based on the tool's documentation, the default plug-in was replaced with one that supports connection reuse; the problem was resolved only after the script was also updated to use the new plug-in. When the test was executed again, the load was generated as expected, without any socket issues.
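The testing tool and plug-ins are not named here, so the sketch below is only a generic illustration of the underlying difference, using Java 11's java.net.http.HttpClient and a placeholder URL: a single shared client keeps persistent (keep-alive) connections and reuses them across iterations, whereas opening a new connection for every request leaves behind exactly the kind of TIME_WAIT sockets observed above.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Connection reuse illustration: one shared HttpClient serves all iterations,
// so repeated requests ride on pooled keep-alive connections instead of
// opening (and then discarding) a new socket each time.
public class ReuseConnections {

    // Shared across all virtual-user iterations rather than created per request.
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://localhost:8080/sample/transaction")) // placeholder URL
                .GET()
                .build();

        for (int i = 0; i < 100; i++) {
            HttpResponse<String> response =
                    CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("Iteration " + i + " -> HTTP " + response.statusCode());
        }
    }
}

The same principle applies to any load testing tool: the HTTP transport used by the virtual users should hold connections open across iterations, otherwise the load generator spends its capacity on TCP connection setup and teardown rather than on the intended workload.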

This whole exercise made one thing very clear: close monitoring of the load generators during performance testing, at least for basic system-level metrics, should be part of the performance testing process. It helps uncover load-generation issues early in the testing cycles, confirms that the load generator is actually producing the expected load, and reduces the time and effort spent on unnecessary application analysis.
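As a rough illustration of such basic monitoring (not the setup used in this PoC, where the Windows netstat command was used for the socket-level analysis), the sketch below samples CPU utilization and free physical memory on the load generator using the JDK's com.sun.management.OperatingSystemMXBean; in practice the operating system's performance counters or the testing tool's built-in load generator monitors serve the same purpose.

import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;

// Periodically samples whole-machine CPU utilization and free physical memory
// on the load generator, so load-generation bottlenecks are visible during the test.
public class LoadGeneratorMonitor {
    public static void main(String[] args) throws InterruptedException {
        OperatingSystemMXBean os = (OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();
        while (true) {
            double cpuPercent = os.getSystemCpuLoad() * 100;
            long freeMb = os.getFreePhysicalMemorySize() / (1024 * 1024);
            long totalMb = os.getTotalPhysicalMemorySize() / (1024 * 1024);
            System.out.printf("CPU: %.1f%%  Free RAM: %d of %d MB%n",
                    cpuPercent, freeMb, totalMb);
            Thread.sleep(5_000); // sample every 5 seconds
        }
    }
}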

Comments

Hi Ananda.

Nice article. Just recently I was working on a framework where I would be constantly monitoring my LGs. Have you been able to figure out what an ideal level is at which LGs should be operating, so that they are not themselves being load tested but are doing their job during load testing?

Hi Krishna,

Thanks for your comment. Sorry for the delayed reply.
Generally, load generators are monitored for basic system metrics such as CPU utilization, available free memory (RAM) and network bandwidth utilization. If extensive logging is enabled for the virtual users, it is also a good idea to check disk I/O and free disk space. The general threshold recommendation for load generators is a maximum of 80% CPU utilization and at least 25% of total RAM available as free memory. This also depends on the nature of the application under test and the testing tool used.

Hi -
In Infosys, do we have any in-house load generation tools similar to LoadRunner, etc.?
If not, what is the tool of choice of the performance engineering team at SETLabs?

Rgds,
Rohit
