Infosys Microsoft Alliance and Solutions blog

Rainbow in Cloud!!

As Cloud Computing becomes pervasive, more and more application patterns are being evaluated for their fit on the Cloud.

The feasibility of, and readiness for, doing Business Intelligence (BI) on the Cloud is becoming one of the key asks. This is not simple and needs to be evaluated from multiple perspectives, business and technology at the very least. The business perspective includes aspects such as the functionality required, cost-benefit, and elasticity of the environment, to name a few. The technology perspective includes the readiness of the stack to support that functionality on the Cloud.

BI brings out insightful information in the form of nice colorful dashboards and reports (read rainbow) that can be sliced, diced, and drilled into, and that suggest patterns to help users make informed business decisions. BI is a very broad topic, so it is important to take a scenario-specific approach when exploring its applicability on the Cloud and recommending solutions.

As a potential customer willing to explore analytics/BI on the Cloud, I could seek solutions around some of the following scenarios. The solutions recommended here are based on the MS stack (SQL Server stack on premise + Windows Azure stack off premise).

1) Leverage Cloud to augment my existing on premise BI capabilities and reduce the ETL and cube processing window (Cloud bursts)

In this solution, existing BI investments running on premise are augmented with cloud computing capability by provisioning additional nodes on the Windows Azure Cloud.
The prerequisite is that the on premise servers must be running Windows Server 2008 R2 with the HPC Pack.

This is a typical Cloud burst scenario and is most useful for reducing the latency window of jobs such as ETL running on SQL Server Integration Services. Cube processing/mining in SQL Server Analysis Services is another workload for which computing nodes can be borrowed from the cloud and released once done.

This approach can help in reducing the existing ETL or cube processing windows in turn making the vision of real time BI "real". It is also useful to cater to any elastic demands during peak load times such as Christmas sales.
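As a rough illustration of the "borrow nodes, then release them" idea, here is a minimal Python sketch that estimates how many extra nodes a job would need to fit a target window. The function name, the notion of "work units", and the assumption that throughput scales linearly with node count are all hypothetical simplifications for illustration; real ETL and cube processing rarely scale this cleanly.

```python
import math


def burst_nodes_needed(work_units, units_per_node_hour, window_hours, on_prem_nodes):
    """Estimate extra cloud nodes needed to finish a job within a window.

    Assumes (hypothetically) that every node processes work at the same
    rate and that the job parallelizes perfectly across nodes.
    """
    if units_per_node_hour <= 0 or window_hours <= 0:
        raise ValueError("rate and window must be positive")
    # Total nodes required to process all work inside the window.
    total_nodes = math.ceil(work_units / (units_per_node_hour * window_hours))
    # Only nodes beyond the existing on premise capacity need to be borrowed.
    return max(0, total_nodes - on_prem_nodes)


if __name__ == "__main__":
    # 1200 units of work, 50 units per node-hour, 4-hour window, 4 local nodes.
    print(burst_nodes_needed(1200, 50, 4, 4))
```

If the result is zero, the on premise cluster already fits the window and no burst is needed.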

Pros
• Non-invasive solution that leverages existing investments
• Good for customers who are not willing to move to off premise solutions because of concerns around data security, compliance, vendor lock-in, etc.

Cons
• Tied to investment in on premise and cannot truly exploit other benefits of cloud such as cost advantages, software as a service, etc.

Moving on to next scenario,

2) My business is gathering data for each user as one large file...this could be user specific browsing/navigation history and I want to do analytics on cloud using MS Cloud stack

Such scenarios can be handled by loading the user specific files into Windows Azure Table/Blob storage. A Worker role can then run predefined analytics/intelligence/mining algorithms over that data and write the results back to Table Storage or SQL Azure, organized around the areas we want to report on. Example KPIs are max. sites visited per user, number of hits to a specific site, max. time spent on a specific site, number of web sensed sites visited, etc. These KPIs can then be reported using SQL Azure Reporting.
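To make the KPI step concrete, here is a minimal Python sketch of the kind of aggregation a Worker role might perform. It works on in-memory records only; the `Visit` record shape and the `compute_kpis` function are hypothetical, and reading from Table/Blob storage or writing to SQL Azure is deliberately left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    """One browsing-history record (hypothetical shape)."""
    user: str
    site: str
    seconds: int


def compute_kpis(visits):
    """Aggregate browsing records into a few report-ready KPIs."""
    sites_per_user = defaultdict(set)
    hits_per_site = defaultdict(int)
    seconds_per_site = defaultdict(int)

    for v in visits:
        sites_per_user[v.user].add(v.site)
        hits_per_site[v.site] += 1
        seconds_per_site[v.site] += v.seconds

    return {
        # Largest number of distinct sites visited by any single user.
        "max_sites_visited_by_a_user": max(
            (len(s) for s in sites_per_user.values()), default=0
        ),
        # Number of hits per site.
        "hits_per_site": dict(hits_per_site),
        # Total time spent per site, in seconds.
        "seconds_per_site": dict(seconds_per_site),
    }


if __name__ == "__main__":
    sample = [
        Visit("a", "x.com", 10),
        Visit("a", "y.com", 5),
        Visit("b", "x.com", 30),
    ]
    print(compute_kpis(sample))
```

The returned dictionary maps naturally onto rows of a results table that a reporting layer could query.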

Pros
• Complete solution on Cloud leverages inherent Cloud advantages such as no upfront capex., elasticity, scalability, accessibility of application across devices and application boundaries

Cons
• BI features limited by what is available with Microsoft Windows Azure stack which currently is not as rich as on-premise SQL Server stack for BI.

Moving on to next scenario,

3) I have a small to medium size on premise LoB application which I would like to move to Cloud and also have basic reporting/business intelligence capability built around it

Table Storage, SQL Azure, Web role, Worker role, and the SQL Azure Reporting stack can be used for very basic reporting and dashboarding using Silverlight/PowerPivot.

In the absence of analytical structures like cubes, aggregated/predefined summary tables will have to be created in SQL Azure based on identified query patterns, and SQL Azure Reporting then needs to be published on top of them. Any trend analysis or pattern matching work can be done using Worker role processing. As of date, there are no OoB data mining/trending algorithms in the Windows Azure Cloud offering, but 3rd party algorithms can be evaluated for integration, or this work can possibly be offloaded to an on premise engine and the results reported back to the Cloud.
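The summary-table idea can be sketched as follows. This minimal Python example uses the standard library's `sqlite3` with an in-memory database purely as a stand-in for SQL Azure; the `sales` and `sales_summary` tables, their columns, and the region/month query pattern are hypothetical.

```python
import sqlite3


def build_sales_summary(conn):
    """Rebuild a pre-aggregated summary table for a known query pattern."""
    conn.execute("DROP TABLE IF EXISTS sales_summary")
    conn.execute(
        """
        CREATE TABLE sales_summary (
            region TEXT,
            month TEXT,
            total_amount REAL,
            order_count INTEGER
        )
        """
    )
    # Aggregate once here so reports read the small summary table
    # instead of scanning the detail rows on every request.
    conn.execute(
        """
        INSERT INTO sales_summary (region, month, total_amount, order_count)
        SELECT region, month, SUM(amount), COUNT(*)
        FROM sales
        GROUP BY region, month
        """
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", "2011-01", 100.0), ("East", "2011-01", 50.0), ("West", "2011-01", 70.0)],
)
build_sales_summary(conn)
rows = conn.execute(
    "SELECT region, month, total_amount, order_count "
    "FROM sales_summary ORDER BY region, month"
).fetchall()
print(rows)
```

In practice a Worker role could rerun a step like this on a schedule, so the reports stay reasonably fresh without a cube.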

Pros and Cons of this approach would be the same as mentioned in 2)

4) For large Enterprise wide implementation, moving complete BI on Windows Azure

As of date, it is nearly impossible to do a pure play Microsoft Cloud solution for an enterprise wide BI implementation using the existing Microsoft BI Cloud stack, as it currently lacks components such as Integration Services, Analysis Services, data mining, and the rich reporting of SQL Server Reporting Services. In such a scenario, it is best to leverage a mix of on premise and off premise solutions and provide a Hybrid solution.

In a Hybrid solution, data integration and analytics can be kept on premise and implemented using existing SQL Server Integration Services and SQL Server Analysis Services. Reporting can be done on the cloud by connecting to the on-premise SQL Server based data warehouse/data mart.

The reporting on Cloud will be only as rich as SQL Azure Reporting, ASP.NET, Silverlight, and PowerPivot allow. PerformancePoint Server is part of the SharePoint stack, and it needs to be evaluated whether it can be deployed on a Web role (IIS). In a nutshell, data integration and processing on premise and presentation on Cloud is the solution.

As such, I don't see a strong fit for complete, advanced Enterprise BI on Cloud with the existing MS Cloud stack, due to the lack of technology readiness at this point in time. However, basic reporting/dashboarding/analytics can be done on Cloud.

Pros
• Hybrid pattern marries the best of both worlds (SQL Server and Windows Azure), providing rich BI capabilities while reaping the cloud's benefits of scale and elasticity.

Cons
• May need to manage both on premise applications and infrastructure and off premise applications

There can possibly be more variations or permutations and combinations; however, with the current readiness of the Microsoft Windows Azure stack, these patterns seem to be a good place to start. Your thoughts/views are welcome!!

Comments

Good points. However for your first scenario, your Pros state "Good for customers who are not willing to move to off premise solutions because of concerns around data security, compliance, vendor lock-in, etc"

How can someone utilize cloud for cube processing without moving data to it and how will that work if there are concerns around data security? Or are we saying that we have some data masking solution sitting in between?

For this you need to leverage the Azure VM role. Build a VHD with SQL Server 2008 SSIS and SSAS (only the software components, no data needed) and save the VHD in Azure Cloud. Once this is done, use this VHD to deploy Azure VM nodes in the cloud.
With Microsoft Windows HPC Server 2008 R2 SP2, you can add these nodes to the on-premise cluster for Azure burst. However, note that SP2 is currently in beta. The detailed steps are listed here
http://technet.microsoft.com/en-us/library/hh184325(WS.10).aspx
