Welcome to the world of Infosys Engineering! It is a half a billion plus organization that takes pride in shaping our engineering aspirations and dreams and bringing them to fruition. We provide engineering services and solutions across the lifecycle of our clients’ offerings, ranging from product ideation to realization and sustenance, that caters to a cross-section of industries - aerospace, automotive, medical devices, retail, telecommunications, hi tech, financial services, energy and utilities just to name a few major ones.


April 12, 2012


Over the last few years, while the world was waking up to cloud computing and Web 2.0, we started hearing buzzwords like 'Big Data' and 'NoSQL'. As Internet companies such as Google, Facebook, Twitter and Amazon grew, the amount of data stored on their servers rose exponentially. That growth increased the complexity of database queries, caching and storage to a point that traditional RDBMS solutions could not cope with. These companies focused on performance, scalability and real-time data, and the data layer, often the most neglected layer, gained importance. NoSQL databases emerged as alternatives to traditional RDBMS solutions in situations where performance, scalability and availability matter more than data consistency and transaction support. NoSQL solutions have addressed problems of performance, scalability and sheer data volume, but should they be seen as replacements for RDBMS, or are the two sides of the same coin?

According to Gartner, Big Data, Patterns and Analytics will be one of the top 10 strategic technologies of 2012. Unstructured data will grow some 80% over the next five years, creating a huge IT challenge, and data growth is expected to remain exponential for the foreseeable future. The growth can be attributed in part to cloud computing and virtualization technologies, and in part to the emergence of social media. Unstructured data such as text, books, journals, documents, audio and video will only keep growing, and database technologies need to evolve to meet the demands of future software solutions.

For the past two to three decades, the backend of almost every enterprise solution has been a relational database such as Oracle, MSSQL or DB2. RDBMS products typically follow the same architecture, have comparable features and offer a consistent SQL interface, so the choice of an RDBMS was largely driven by platform or licensing requirements. An RDBMS can handle large amounts of data and offers distributed capabilities, but disruptive technologies like cloud computing and social media have placed huge demands on database performance, scalability and availability, which RDBMS solutions find difficult to meet with their existing architecture models. The ACID model (atomicity, consistency, isolation, durability), which is ideal for transactions, also becomes a source of performance problems when dealing with very large data volumes. In addition, RDBMS solutions suffer from higher latency during failures in distributed environments. These changing requirements have forced architectural changes in the database landscape and led to the emergence of non-relational databases, or NoSQL.
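To make the ACID point concrete, here is a minimal sketch of an atomic transfer using Python's built-in `sqlite3` module. The `accounts` table, the account names and the `transfer` helper are assumptions made up for this illustration; they do not come from any particular product mentioned in this post.

```python
import sqlite3


def open_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        # Used as a context manager, the connection commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

Calling `transfer(conn, "alice", "bob", 500)` on a fresh `open_bank()` connection returns `False` and leaves both balances untouched, because the two updates are rolled back together. That all-or-nothing guarantee is what banking-style applications rely on, and it is also the coordination cost that becomes expensive at very large scale.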

NoSQL databases typically do not enforce a formal schema and do not use SQL as their query language. They can scale horizontally by adding new nodes, so instead of relying on expensive servers they grow by adding clusters of cheap commodity machines. The volume of Big Data that NoSQL systems can handle outstrips what even the biggest RDBMS can manage. Their relatively simple data models also mean simpler administration and tuning. NoSQL databases give priority to performance, scalability and real-time availability of data. On the other hand, they have poor support for analytics and business intelligence, and they generally do not support atomicity or consistency, and in most cases not proper durability either. For applications that require absolute transactional integrity and serialization, SQL databases are therefore still the top choice.
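Scaling out by adding nodes can be sketched in a few lines. The toy cluster below uses a consistent-hash ring, which is one common way to spread keys across nodes; it is an assumption chosen for illustration, not a description of how any specific NoSQL product works. The `Cluster` class, the node names and the `replicas` parameter are all hypothetical.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Cluster:
    """Toy in-memory key-value cluster that places keys with consistent hashing."""

    def __init__(self, node_names, replicas=64):
        self.replicas = replicas  # virtual points per node, for a smoother spread
        self.ring = []            # sorted list of (position, node_name)
        self.data = {}            # node_name -> {key: value}
        for name in node_names:
            self.add_node(name)

    def _owner(self, key):
        """Return the node owning the first ring point at or after the key's hash."""
        idx = bisect.bisect(self.ring, (_hash(key), "")) % len(self.ring)
        return self.ring[idx][1]

    def add_node(self, name):
        """Add a node and return how many existing keys moved onto it."""
        self.data[name] = {}
        for i in range(self.replicas):
            bisect.insort(self.ring, (_hash(f"{name}#{i}"), name))
        moved = 0
        for node, store in self.data.items():
            if node == name:
                continue
            # Only keys that now hash to the new node need to be relocated.
            for key in [k for k in store if self._owner(k) == name]:
                self.data[name][key] = store.pop(key)
                moved += 1
        return moved

    def put(self, key, value):
        self.data[self._owner(key)][key] = value

    def get(self, key):
        return self.data[self._owner(key)].get(key)
```

Starting with `Cluster(["node-a", "node-b"])`, storing some keys and then calling `add_node("node-c")` relocates only the keys that now belong to the new node; everything else stays where it was. That is why capacity can be added on cheap servers without reshuffling the whole data set.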

Depending on their data storage mechanism, NoSQL databases are categorized into key-value stores, document stores, graph databases, Bigtable-style column stores and so on. There are a number of NoSQL solutions in the market, such as MongoDB, CouchDB, HBase, Cassandra, Redis, Riak, Neo4j and Voldemort, along with Hadoop, which is strictly a distributed storage and processing framework rather than a database. They differ in their storage models: MongoDB stores documents in BSON format, CouchDB is a document database, Redis, Voldemort and Riak are key-value stores, Cassandra and HBase are column oriented, and Neo4j is a graph database.
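The difference between these storage models is easier to see with the same record laid out three ways. The sketch below uses plain Python dicts as stand-ins; the sample user, the key format and the helper names are assumptions for illustration and do not reflect the real API of any of the products listed above.

```python
import json

# A hypothetical user record used in all three layouts.
user = {
    "id": "u42",
    "name": "Asha",
    "tags": ["nosql", "cloud"],
    "address": {"city": "Pune"},
}

# Key-value store: the value is an opaque blob, reachable only through its key.
kv_store = {"user:u42": json.dumps(user)}

# Document store: the record keeps its nested structure, so fields can be queried.
documents = [user]

# Column-oriented store: row key -> column family -> column -> value.
wide_rows = {
    "u42": {
        "profile": {"name": "Asha", "city": "Pune"},
        "tags": {"nosql": "", "cloud": ""},
    }
}


def kv_get(store, key):
    """Fetch and decode a whole value by key; there is no query on its contents."""
    raw = store.get(key)
    return json.loads(raw) if raw is not None else None


def find_by_path(docs, path, expected):
    """Return documents whose dotted field path (e.g. 'address.city') equals expected."""
    matches = []
    for doc in docs:
        node = doc
        for part in path.split("."):
            node = node.get(part) if isinstance(node, dict) else None
        if node == expected:
            matches.append(doc)
    return matches


def read_column(rows, row_key, family, column):
    """Read a single cell without touching the rest of the row."""
    return rows.get(row_key, {}).get(family, {}).get(column)
```

The key-value layout can only answer "give me `user:u42`", the document layout can answer "who lives in Pune?", and the column-oriented layout can read one cell of a very wide row. A graph database would store the relationships between such records as first-class edges, which is not shown here.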

So are NoSQL databases really going to replace RDBMS? That is unlikely, because the choice between the two depends on the use case. Applications that depend on transaction support (banking, airlines etc.) will continue to work with RDBMS, while social media applications, which mostly deal with unstructured data, will look at NoSQL alternatives. The two classes of database cater to different requirements. A hybrid architecture that leverages the strengths of both may prove beneficial: applications can use a traditional RDBMS for structured data (customers, products etc.) and a NoSQL database for unstructured data (logs, comments, reviews etc.). Your choice of database will depend on your goals: whether you need transaction support, high availability, scalability, performance and so on. It is also worth remembering that RDBMS have been around for a long time, while NoSQL is still a maturing technology. Most NoSQL solutions are open source and may not come with strong vendor support. For now, SQL and NoSQL are going to co-exist, as there is no one-size-fits-all solution.
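A hybrid setup along these lines might look like the following sketch. It keeps structured customer records in a relational table (using Python's built-in `sqlite3`) and free-form reviews in a schemaless collection. The `HybridStore` class, its method names and the in-memory dict standing in for a real NoSQL store are all assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict


class HybridStore:
    """Structured data in SQL, unstructured data in a schemaless collection."""

    def __init__(self):
        # Relational side: fixed schema with constraints.
        self.sql = sqlite3.connect(":memory:")
        self.sql.execute(
            "CREATE TABLE customers ("
            "id INTEGER PRIMARY KEY, "
            "name TEXT NOT NULL, "
            "email TEXT NOT NULL UNIQUE)"
        )
        # Non-relational side: a dict standing in for a document store.
        self.reviews = defaultdict(list)  # customer_id -> list of free-form dicts

    def add_customer(self, name, email):
        """Insert a customer inside a transaction and return the new id."""
        with self.sql:
            cur = self.sql.execute(
                "INSERT INTO customers (name, email) VALUES (?, ?)", (name, email)
            )
        return cur.lastrowid

    def add_review(self, customer_id, **fields):
        """Store a review with whatever fields the caller supplies; no schema."""
        self.reviews[customer_id].append(dict(fields))

    def customer_profile(self, customer_id):
        """Combine the relational row with the customer's unstructured reviews."""
        row = self.sql.execute(
            "SELECT name, email FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        if row is None:
            return None
        return {
            "name": row[0],
            "email": row[1],
            "reviews": list(self.reviews.get(customer_id, [])),
        }
```

The relational side rejects a second customer with the same email, because the schema enforces it, while the review side happily accepts one review with a `rating` field and another with `stars` and `text`. Each store is doing the job it is suited for, and the application stitches the two together when it builds a profile.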
