Workload Modeling of SaaS-Based Multi-Tenant Applications
One of the technical challenges of a SaaS-based multi-tenant application is ensuring that it meets the performance requirements of every tenant accessing it.
Major performance problems in web applications, including SaaS-based multi-tenant web applications, can only be corrected by recreating the production scenario in a controlled environment and arriving at a solution through performance testing and analysis. The key to this approach is accurately identifying per-tenant parameters such as hits per second, response time per request, number of concurrent users, and think time. Mining the web server access logs for these values makes it possible to recreate the production workload for load testing.
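To make two of these parameters concrete, here is a minimal Python sketch of how think time and concurrent users might be estimated from log records. It assumes the access log has already been reduced to (epoch seconds, user) pairs. The function names, the 30-minute session timeout, and the 60-second window are illustrative assumptions, not values taken from any particular tool. The gap between two consecutive requests from the same user also includes the response time of the first request, so it only approximates think time.

```python
from collections import defaultdict

SESSION_TIMEOUT = 1800  # seconds; assumed cut-off between two separate visits


def think_times(records, timeout=SESSION_TIMEOUT):
    """Return gaps (in seconds) between consecutive requests of the same user.

    Gaps of `timeout` seconds or more are treated as a new session and dropped.
    """
    by_user = defaultdict(list)
    for ts, user in records:
        by_user[user].append(ts)

    gaps = []
    for stamps in by_user.values():
        stamps.sort()
        for prev, cur in zip(stamps, stamps[1:]):
            gap = cur - prev
            if gap < timeout:
                gaps.append(gap)
    return gaps


def peak_concurrent_users(records, window=60):
    """Return the largest number of distinct users seen in any fixed window."""
    users_per_window = defaultdict(set)
    for ts, user in records:
        users_per_window[ts // window].add(user)
    return max((len(users) for users in users_per_window.values()), default=0)


# Made-up sample data: (epoch seconds, user name)
RECORDS = [(0, "alice"), (5, "bob"), (20, "alice"), (70, "alice"), (4000, "bob")]

if __name__ == "__main__":
    print("think times:", think_times(RECORDS))
    print("peak concurrent users:", peak_concurrent_users(RECORDS))
```

Counting distinct users per fixed window is only one possible definition of concurrency. A session-overlap count would give different numbers, so the definition should match the one the load-testing tool expects.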
A key challenge is extracting these tenant-specific parameters from the centralized log files maintained by the web server of a multi-tenant SaaS application. With log file analysis, information that the web server does not normally collect can only be recorded by modifying the URL. As long as the URLs of the multi-tenant SaaS application contain a tenant identifier, the parameters can be tracked per tenant. However, many SaaS application providers track the tenant through the session instead of appending a tenant identifier to each URL.
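For the case where the URL does carry the tenant identifier, the extraction step could look roughly like the Python sketch below. Both URL conventions shown, a `tenant` query parameter and a `/t/<tenant>/` path prefix, are hypothetical. A real application would need its own rule here.

```python
from urllib.parse import parse_qs, urlsplit


def tenant_from_url(url, param="tenant", path_prefix="t"):
    """Return the tenant identifier found in a request URL, or None.

    Assumed (hypothetical) conventions:
      * query parameter:  /app/orders?tenant=acme
      * path prefix:      /t/acme/orders
    """
    parts = urlsplit(url)

    # 1. Look for the tenant in the query string.
    values = parse_qs(parts.query).get(param)
    if values:
        return values[0]

    # 2. Fall back to a /<path_prefix>/<tenant>/... path layout.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) >= 2 and segments[0] == path_prefix:
        return segments[1]

    # No tenant information in the URL (e.g. session-based tracking).
    return None


if __name__ == "__main__":
    for url in ("/app/orders?tenant=acme&page=2", "/t/globex/orders", "/app/orders"):
        print(url, "->", tenant_from_url(url))
```

A `None` result corresponds to the session-based case, where the URL alone is not enough to attribute the request to a tenant.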
One approach to this challenge is to capture the user name that the web server logs for each request and use the user-to-tenant mapping data maintained by the SaaS application to determine which tenant the user belongs to. In this way, each request can be assigned to a group of tenant-specific requests. Once the requests have been grouped by tenant, the data can be mined to arrive at tenant-specific parameters such as hits per second, response time per request, number of concurrent users, and think time.
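The grouping step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions. It assumes the log is in Common Log Format with the authenticated user name in the third field, and it uses an in-memory dictionary, `USER_TO_TENANT`, as a stand-in for the application's real mapping data. The sample log lines, user names, and tenant names are made up. Common Log Format does not record response time, so only hits per second is computed here. Response time would need a log format that includes request duration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Common Log Format: host ident authuser [date] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

# Hypothetical stand-in for the application's user-to-tenant mapping data.
USER_TO_TENANT = {"alice": "acme", "bob": "acme", "carol": "globex"}


def group_by_tenant(lines, user_to_tenant):
    """Group parsed requests by tenant; skip unparseable or unmapped lines."""
    groups = defaultdict(list)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue
        tenant = user_to_tenant.get(match.group("user"))
        if tenant is None:
            continue  # anonymous ("-") or unknown user
        ts = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        groups[tenant].append((ts, match.group("user"), match.group("request")))
    return dict(groups)


def hits_per_second(requests):
    """Average request rate over the observed time span (at least 1 second)."""
    times = [ts for ts, _, _ in requests]
    span = (max(times) - min(times)).total_seconds()
    return len(times) / max(span, 1.0)


# Made-up sample log lines.
SAMPLE = [
    'h1 - alice [10/Oct/2023:13:55:00 +0000] "GET /home HTTP/1.1" 200 512',
    'h2 - carol [10/Oct/2023:13:55:01 +0000] "GET /home HTTP/1.1" 200 512',
    'h1 - alice [10/Oct/2023:13:55:04 +0000] "GET /report HTTP/1.1" 200 2048',
    'h3 - bob [10/Oct/2023:13:55:10 +0000] "POST /save HTTP/1.1" 200 128',
    'h4 - - [10/Oct/2023:13:55:11 +0000] "GET /login HTTP/1.1" 200 64',
]

if __name__ == "__main__":
    for tenant, requests in group_by_tenant(SAMPLE, USER_TO_TENANT).items():
        print(tenant, len(requests), round(hits_per_second(requests), 3))
```

The dictionary lookup returns exactly one tenant per user, so the sketch relies on each user belonging to a single tenant.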
This approach fails when one user belongs to multiple tenants, because the user name alone no longer identifies the tenant. Another approach is to use a page tagging technique to obtain the same tenant-specific parameters. Are there any other approaches to address this challenge?