Today, the world of open source is witnessing major innovations, and at a considerably high pace. Intense focus from academia, Open Source communities (Hive, Spark, Apache, etc.) as well as industry leaders are making this possible, to a large extent. In addition, with service providers and other IT players adequately providing support on such open source components, enterprises are increasingly shifting their workloads to an open data architecture. Particularly, enterprises are making this move as data is becoming exponential - in terms of volume, variety and velocity - and license costs for proprietary data platforms are rising.
They are beginning to experience unprecedented cost-performance benefits with open-source technologies compared to proprietary systems. For instance, Spark, which uses in-memory capabilities of the underlying Hadoop with its own map reduce, is 100 times faster than a traditional Hadoop-based system.