The commoditization of technology has reached its pinnacle with the advent of the recent paradigm of Cloud Computing. The Infosys Cloud Computing blog is a platform to exchange thoughts, ideas and opinions with Infosys experts on Cloud Computing.


Cloud and Big Data: Is There a Relation Between the Two?

As Rodney Brown noted in his latest post, "Nonprofit May Help Ease Big-Data Talent Shortage", Big Data is one of the most hyped and least understood areas of the Cloud ecosystem. But aren't Big Data and Cloud two different areas altogether? How do they really come together?

Big Data is all about extracting VALUE from the "Variety, Velocity and Volume" (3V) of the available information assets, while Cloud focuses on on-demand, elastic, scalable, pay-per-use, self-service models. The question often asked, then, is what the relationship between Cloud and Big Data is. Why are these two entirely different areas discussed together?

Anyone who has used "Elastic MapReduce" on Amazon will immediately appreciate one of the most evident aspects of this relationship. Big Data needs large, on-demand compute power and distributed storage to crunch the 3V data problem, and Cloud provides exactly this elastic, on-demand compute. With "Apache Hadoop", the de facto standard for Big Data processing, the processing in its current state is mostly batch oriented. This bursty workload pattern makes Big Data computing infrastructure a true case for the Cloud. Amazon "Elastic MapReduce" demonstrates how Big Data processing can be done by leveraging the power of elastic Cloud compute. But is that the only part of the relationship between Cloud and Big Data?
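To make the burst argument concrete, here is a minimal sketch that compares the node-hours consumed by a cluster sized for peak load with those consumed by an elastic cluster that follows demand. The demand profile (a 4-hour nightly batch window needing 50 nodes) and the function names are purely hypothetical assumptions for illustration; they are not measurements and not part of any Amazon API.

```python
def node_hours_fixed(hourly_demand):
    """Node-hours when the cluster is sized for peak demand all day."""
    return max(hourly_demand) * len(hourly_demand)


def node_hours_elastic(hourly_demand):
    """Node-hours when nodes are added and released to match demand."""
    return sum(hourly_demand)


# Hypothetical 24-hour profile: idle for 20 hours, then a 4-hour batch
# job that needs 50 nodes. These figures are assumed, not measured.
demand = [0] * 20 + [50] * 4

fixed = node_hours_fixed(demand)      # 50 nodes * 24 hours
elastic = node_hours_elastic(demand)  # 50 nodes * 4 hours

print(f"fixed: {fixed} node-hours, elastic: {elastic} node-hours")
```

Under these assumed numbers the fixed cluster sits idle for most of the day, which is the gap that on-demand compute is meant to close. Real savings depend on pricing, start-up time and data transfer, none of which this sketch models.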

On a deeper look, other patterns of this relationship emerge.

Cloud has glorified the "As-a-Service" model by hiding the complexity and challenges involved in building a scalable, elastic, self-service application. Big Data processing has the same requirement. Hadoop, in a similar way, hides the complexity of large-scale distributed processing from the end user. Users write "MapReduce" programs, or use familiar constructs with "Hive" or "Pig", and are able to do the big data crunching seamlessly without worrying about node failures, linear scalability, replication, fault tolerance, elasticity and so on, while Hadoop silently provides the large-scale distributed capabilities behind the scenes. This simplification is the prime reason for the mass adoption of both Big Data and Cloud. Amazon has just demonstrated how the simplification provided by the combination of Cloud and Big Data can increase adoption for the seemingly complex problem of large-scale distributed processing. The key here is SIMPLIFICATION!!
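As an illustration of how little the user has to write, here is a minimal single-process sketch of the MapReduce programming model using the classic word count. It is not the Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and the shuffle step stands in for the grouping that a framework such as Hadoop would perform across many nodes, along with the scheduling, replication and failure handling that this sketch leaves out entirely.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key; a real framework does this between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an in-memory input."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data on cloud", "cloud and big data"])
print(counts)
```

The user-written parts are only map_phase and reduce_phase; everything else is the framework's job, which is the simplification described above.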

Both Cloud and Big Data are about delivering value to the enterprise by lowering the cost of ownership. Cloud does this through the pay-per-use model, turning CAPEX into OPEX, while Apache open source software has brought down the licensing cost of a sophisticated solution that would otherwise have cost millions to build or buy. Both Big Data and Cloud have been driving costs down and bringing VALUE to the enterprise. We have witnessed early adopters of Big Data moving away from traditional licensing models to a more open source model, thus lowering the overall cost per terabyte (TB) processed. Both Cloud and Big Data deliver value, and the key is how quickly enterprises can overcome the hurdles of enterprise open source adoption and embark on the Big Data journey.

Cloud and Big Data both bring data security and privacy concerns. This is where system integrators have been building solutions that marry Cloud and Big Data within the enterprise, creating elastic, scalable private cloud solutions that deliver the same value and let enterprises put scalable distributed processing into action in-house. Here again we see the similarity between Cloud and Big Data: both raise security concerns, and in both cases innovative solutions can drive adoption within the enterprise.

"Analytics as a Service" is the buzz these days, and providing on-demand analytics to the enterprise is the wave to follow. We will explore this aspect of the relationship between Cloud and Big Data in our next post. Until then, stay tuned!


In one line: Big Data is about collecting all kinds of data, and cloud computing is about what you deliver to the end user out of that collected data.
For example, there may be different kinds of sensor data. A retail company may not need sensor data collected from the environment, and vice versa. As a cloud provider, I have all the sensor data; however, depending on my end user/customer, I'll provide the sensor data related to retail or the sensor data related to the environment. Please let me know your thoughts...

Is there more demand for Big Data or for cloud computing?
