The Era Of Open Source Platforms Is Here
Today, the world of open source is witnessing major innovations at a considerably high pace. Intense focus from academia, open source communities (such as the Apache projects Hive and Spark) and industry leaders is making this possible to a large extent. In addition, with service providers and other IT players offering adequate support for such open source components, enterprises are increasingly shifting their workloads to an open data architecture. In particular, enterprises are making this move because data is growing exponentially - in terms of volume, variety and velocity - and license costs for proprietary data platforms are rising.
They are beginning to experience unprecedented cost-performance benefits with open source technologies compared to proprietary systems. For instance, Spark, which can run on a Hadoop cluster but keeps intermediate data in memory instead of writing it to disk between MapReduce stages, is reported by the Spark project to run some workloads up to 100 times faster than Hadoop MapReduce.
Not without challenges
According to Gartner, "73% of companies surveyed have invested or were planning to invest in Big Data in 2014. Of these 65% are struggling to derive value from data". There are many reasons for this. Although open source innovations have increased, so has the number of programming paradigms associated with them. With every new technology, models that were already built have to be reworked or retired, IT staff have to be retrained and newer versions have to be managed. Skilled staff, and the support that comes with them, are scarce. Enterprise governance requirements and the need for monitoring mechanisms increase the implementation time. While enterprises are keen to derive actionable insights from their data quickly and consistently, such issues delay the returns from an open data architecture.
Any Big Data system must be able to include the most recent data in its analysis. Can we pose our questions to data that is recent, even real-time? How do we minimize the distance between the raw data and the business questions it is relevant to? This depends on two things. One, how quickly you can push data from your transactional system into your Big Data system. Two, once the data is available, how quickly you can transform it into the end models that answer the business questions.
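To make the two delays concrete, here is a minimal Python sketch that splits the end-to-end latency of one record into an ingestion lag and a transformation lag. The function name, the field names and the timestamps are all hypothetical and chosen only for illustration; they do not describe any particular product.

```python
from datetime import datetime, timedelta


def freshness_breakdown(created_at: datetime,
                        ingested_at: datetime,
                        modelled_at: datetime) -> dict:
    """Split end-to-end data latency into ingestion and transformation lag."""
    if not created_at <= ingested_at <= modelled_at:
        raise ValueError("timestamps must be in chronological order")
    return {
        # Time taken to move the record from the transactional system
        # into the Big Data system.
        "ingestion_lag": ingested_at - created_at,
        # Time taken to turn the ingested record into the end model.
        "transformation_lag": modelled_at - ingested_at,
        # Total delay before the business can ask questions of the record.
        "end_to_end": modelled_at - created_at,
    }


# Hypothetical timestamps for a single sales transaction.
created = datetime(2015, 1, 5, 9, 0, 0)
ingested = created + timedelta(minutes=15)
modelled = ingested + timedelta(minutes=45)

lags = freshness_breakdown(created, ingested, modelled)
for name, lag in lags.items():
    print(f"{name}: {lag}")
```

Tracking the two lags separately shows which of them dominates, and therefore whether effort is better spent on faster ingestion or on faster transformation.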
Opening up to these challenges
At Infosys, we have tried to address many of these issues by investing in the Infosys Information Platform (IIP). It comes with a single-click installer, a data ingestion framework and a graphical data modelling tool, and it provides out-of-the-box adapters to diverse data sources along with an easy way to build new adapters when needed. Out-of-the-box integration with RStudio makes it easy to leverage the power of clusters while modelling data. All this can be achieved without writing code against the underlying open source frameworks, so enterprises can focus on business outcomes without having to hire highly paid open source experts. They can even change the runtime paradigm in a non-disruptive manner if they want to switch between open source paradigms to take advantage of emerging innovations.
Proof of the pudding
IIP's list of successes has been growing rapidly. To cite just two instances: its predictive maintenance model reduced the number of machine breakdowns for a mining company, with significant improvements in the company's downstream supply chain operations. For a leading supermarket chain, IIP helped derive key business insights, such as the top 10 product categories by profit and sales, hourly sales trends, and category-wise profit contribution margin and sales variance, in less than four weeks.
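For readers who want a feel for the kind of aggregation behind insights such as "top categories by profit" and "hourly sales trends", here is a small Python sketch. It is only an illustration on made-up transactions, written with the standard library; it is not how IIP computes these figures, and the function names and data are assumptions.

```python
from collections import defaultdict
from datetime import datetime

# Made-up transactions: (timestamp, category, sales amount, profit).
transactions = [
    (datetime(2015, 1, 5, 9, 10), "Dairy", 120.0, 18.0),
    (datetime(2015, 1, 5, 9, 40), "Bakery", 80.0, 24.0),
    (datetime(2015, 1, 5, 10, 5), "Dairy", 60.0, 9.0),
    (datetime(2015, 1, 5, 10, 30), "Produce", 150.0, 30.0),
]


def top_categories_by_profit(rows, n=10):
    """Return up to n (category, total profit) pairs, highest profit first."""
    profit = defaultdict(float)
    for _, category, _, row_profit in rows:
        profit[category] += row_profit
    return sorted(profit.items(), key=lambda kv: kv[1], reverse=True)[:n]


def hourly_sales(rows):
    """Return total sales per hour of day, ordered by hour."""
    totals = defaultdict(float)
    for timestamp, _, sales, _ in rows:
        totals[timestamp.hour] += sales
    return dict(sorted(totals.items()))


print(top_categories_by_profit(transactions))
print(hourly_sales(transactions))
```

On a real cluster the same grouping and sorting would run over far larger volumes, but the shape of the question stays the same.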
Only this week, IIP added another success story, and a milestone at that. It is now available on Amazon Web Services Marketplace (AWS Marketplace), bringing with it all the speed, flexibility and affordability of a cloud-based platform. America's top chocolatier, Hershey's, has already used IIP on AWS to arrive at revenue-generating insights faster than with traditional analytics.
As a true-blue technologist, I find it very exciting to see this era unfold and, better still, to participate in its making.