Open Source has revolutionized the IT sector by harnessing the intelligence and efforts of the community to develop and maintain software faster. Our experts discuss the latest happenings and give their viewpoints on how open source can be leveraged for your organization’s transformation.

March 28, 2018

2018 Trends in Open Source

Over the years, Open Source has evolved from providing free software alternatives to driving innovation in many areas. In the last couple of years, Open Source projects have been leading innovation and even defining standards for the future. For example, cloud engineering and solutions are being led by Open Source projects, as are Machine Learning and AI. Based on my experience working with Open Source technologies, here are a few trends that will define 2018.

Progressive Web Apps
The experience layer has been a critical aspect of digital transformation for the last several years. Omni-channel delivery across web, mobile, and other channels has become a critical capability for successful digital experiences.

AngularJS, React, React Native, Vue.js and related technologies are becoming more stable and continue to improve. This trend will continue in 2018. Native mobile frameworks such as React Native will keep growing in number as they deliver value for the user experience.

Cloud Native Engineering & PAAS
Platform as a Service (PAAS) has taken shape and found its position within the cloud strategy of enterprises. Standards-based products like OpenShift, PCF, Kubernetes, and Tectonic have moved from Container as a Service to Platform as a Service. The Cloud Native Computing Foundation is bringing innovative open source projects together to make cloud strategy and migration easier to adopt.

As OpenShift, PCF and other Kubernetes based products graduate from Container as a Service (CAAS) to Platform as a Service (PAAS), two trends stand out. One is enterprises adopting product-based implementations like OpenShift, PCF, and Tectonic. The other is enterprises building their own PAAS by combining best-of-breed open source projects. Enterprises are looking for a PAAS that gives a single window for the service catalogue, covering not just the private cloud but also the native services of public cloud providers. Multi-cloud support will be a key differentiator driving PAAS adoption.

Anything as a Service (XAAS)
Anything as a Service (XAAS) represents the idea that any service should be available over the internet rather than on-premises. Earlier years saw SAAS and IAAS adopted by big enterprises. Network as a Service, Monitoring as a Service, Communication as a Service, and Storage as a Service are now becoming very popular and will continue to hog the limelight.

High-speed networks combined with stable virtualization and container technologies are giving this concept the power to take shape, and its adoption will increase in 2018.

Open Source Database Adoption on the Rise
Open Source RDBMS adoption will increase in 2018, as it has over the last few years. Massively parallel processing products built on top of open source RDBMSs are finding a role in use cases where horizontal scalability is critical.

EDB PostgreSQL and MariaDB keep enhancing their features for horizontal scaling, cloud migration, and database-as-a-service support on the cloud. Migration from proprietary databases to Open Source will continue to gain traction in 2018, as seen in the last couple of years. Products like Citus and Vitess are making their way into the mainstream with support for horizontal scaling and NoSQL-style capabilities such as MapReduce, replication, and sharding, at good cost effectiveness.

NoSQL adoption is on the rise as cloud migration and adoption increase, and this will continue in 2018. With data ingestion exploding from different parts of the enterprise and multi-structured, unstructured, and flexibly structured models increasing, NoSQL adoption will keep growing. Engagement databases like Couchbase and Aerospike, whose memory-first architecture gives the experience layer faster access to data, are seeing great traction. MongoDB's release of ACID capability in its new version this year will be a great differentiator in the market.

Machine Learning and AI
Gartner predicts that the scope of Machine Learning and AI will increase further in 2018. Several greenfield areas will see a surge around ML and AI. Technology supporting the adoption of Machine Learning within enterprises is growing, with platform builds, adoption of new user development environments, and a move from on-premises to cloud-based compute for learning systems.

New Open Source intelligent solutions are set to change the way systems and people interact with each other. Conversational AI with a question-and-answer paradigm and chatbots will become a medium of interaction. Autonomous vehicles and drones are set to be used more in business. Smart glasses technology will become another user experience channel that makes things easier for users.

Real-time Integration and Data Analytics
Modern data analytics platforms are moving away from old-fashioned ETL and integration. Real-time data streaming with modern ELT/ETL programming models, such as Nifi with Kafka, is becoming mainstream, and enterprises are ready to adopt it.

Confluent, in its new avatar as the company behind Kafka, will give a great push to this technology. Going beyond messaging into more reactive architecture and design is seeing increased adoption among large enterprises. Nifi, with its large file processing and especially its ETL-style data integration on streams, has made real-time data transformation a real possibility. 2018 could be the year of large-scale Nifi adoption.

2018 will be an exciting year for Open Source!

October 30, 2017

Query-driven data modeling methodology for Apache Cassandra

This blog explains how to use query-driven data modeling in Apache Cassandra. NoSQL data modeling is a process that identifies entities as well as the relationships between them. It can be used to determine patterns when accessing data as well as the types of queries to be performed. In doing so, it reveals how data is organized and structured along with how database tables are designed and created. It is important to note that indexing the data can degrade the performance of queries. Hence, understanding indexing is essential in the data modeling process.

Data modeling in Cassandra follows a query-driven approach, in which specific queries are the key to organizing data. Let me first quickly explain these terms: queries retrieve data from tables, and the schema defines how the data is arranged in a table. Thus, a query-driven database design facilitates faster reading and writing of data, i.e., the better the model design, the faster data is written and read.

First, we must create a conceptual data model that defines all known entities, relationships, attribute types, keys, cardinality, and other constraints. This data model should be created in collaboration with business stakeholders and analysts. For example, a conceptual data model could be presented as an ER diagram.

The next step is logical data modeling. Here, the conceptual data model is mapped to a logical data model based on queries that are defined in an application workflow. The logical data model corresponds to a keyspace where table schemas define columns as well as primary, partition and clustering keys. Thus, the query-driven approach provides a logical data model using data modeling principles, mapping rules, and mapping patterns.

Here are some rules for query predicates that ensure stability and efficiency:

    Only primary key columns should be used in the query predicate

    Each partition key column in the query predicate should be restricted to a specific value with an equality condition

    Clustering columns may be omitted in the query predicate

    All partition key(s) must be used in the predicate
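The rules above can be sketched as a small check that runs before a query is accepted into the model. This is a minimal illustration, not part of Cassandra or any driver: the `TableKey` class, the `check_predicate` function, and the hotel example columns are hypothetical names, and the second rule is read here as "each partition key column is matched with an equality operator".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableKey:
    """Primary key layout of one table (hypothetical helper type)."""
    partition_key: tuple
    clustering_key: tuple = ()


def check_predicate(key, predicate):
    """Return a list of rule violations for a query predicate.

    `predicate` maps each column used in the WHERE clause to its
    operator, for example {"hotel_id": "=", "room_no": ">"}.
    """
    problems = []
    primary = set(key.partition_key) | set(key.clustering_key)

    # Only primary key columns may appear in the predicate.
    for column in predicate:
        if column not in primary:
            problems.append(f"{column}: not a primary key column")

    # Every partition key column must appear, and (by assumption)
    # must be matched with an equality operator.
    for column in key.partition_key:
        if column not in predicate:
            problems.append(f"{column}: partition key column missing from predicate")
        elif predicate[column] != "=":
            problems.append(f"{column}: partition key column needs an equality condition")

    # Clustering columns are optional, so nothing is checked for them here.
    return problems


rooms_by_hotel = TableKey(partition_key=("hotel_id",), clustering_key=("room_no",))
print(check_predicate(rooms_by_hotel, {"hotel_id": "="}))
print(check_predicate(rooms_by_hotel, {"room_no": "="}))
```

An empty list means the predicate passes all four rules. A query that filters only on `room_no` is reported because it leaves out the partition key.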

Besides these query predicate rules, there are additional data modeling principles to map to logical data models. It is important to note that violating these principles and rules will affect the ability to support query requirements and may lead to loss of data and performance degradation.

Here are the fundamental principles of logical data modeling:

1.    Know your data, particularly the entity and relationship type keys that need to be preserved and relied on to organize the data properly

2.    Know your queries such that all columns are preserved at the logical level

3.    Enable data nesting to merge multiple entities together based on a known criterion

4.    Minimize data duplication to ensure space and time efficiency

5.    Use equality search attributes to map to the prefix columns of the primary key

6.    Use inequality search attributes to map to the table clustering key column

7.    Use ordering attributes to map to clustering key columns with ascending or descending clustering order

8.    Use key attribute types to map to primary key columns and uniquely identify table rows
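Principles 5 to 8 can be read as a recipe for deriving a table's primary key from one query. The sketch below is one possible interpretation, not a definitive implementation: `build_create_table`, the `readings_by_sensor` table, and its columns are hypothetical, and the order in which inequality, ordering, and key attributes are appended to the clustering key is an assumption. It only builds a CQL statement as text and does not connect to a cluster.

```python
def build_create_table(table, columns, equality, inequality=(), ordering=(), key_attrs=()):
    """Derive a CREATE TABLE statement from the attributes of one query.

    columns    : dict of column name -> CQL type
    equality   : columns searched with '=', mapped to the key prefix
    inequality : columns searched with a range, mapped to clustering columns
    ordering   : (column, "ASC" | "DESC") pairs, mapped to clustering order
    key_attrs  : extra key columns that make each row unique
    """
    partition = list(equality)   # equality attributes form the key prefix
    clustering = []              # remaining key columns, in order
    direction = {}               # clustering column -> ASC or DESC

    def add_clustering(column):
        if column not in partition and column not in clustering:
            clustering.append(column)

    for column in inequality:
        add_clustering(column)
    for column, order in ordering:
        add_clustering(column)
        if column in clustering:
            direction[column] = order.upper()
    for column in key_attrs:
        add_clustering(column)

    if not partition:
        raise ValueError("at least one equality attribute is needed for the partition key")
    unknown = [c for c in partition + clustering if c not in columns]
    if unknown:
        raise ValueError(f"key columns without a type: {unknown}")

    lines = [f"    {name} {cql_type}," for name, cql_type in columns.items()]
    key = ", ".join([f"({', '.join(partition)})"] + clustering)
    lines.append(f"    PRIMARY KEY ({key})")
    statement = f"CREATE TABLE {table} (\n" + "\n".join(lines) + "\n)"
    if clustering:
        orders = ", ".join(f"{c} {direction.get(c, 'ASC')}" for c in clustering)
        statement += f" WITH CLUSTERING ORDER BY ({orders})"
    return statement + ";"


# Query: readings for one sensor on one day, after a given time, newest first.
print(build_create_table(
    "readings_by_sensor",
    {"sensor_id": "uuid", "day": "date", "ts": "timestamp", "value": "double"},
    equality=("sensor_id", "day"),
    inequality=("ts",),
    ordering=(("ts", "DESC"),),
))
```

For the example query, the two equality attributes become the partition key and the range attribute `ts` becomes a descending clustering column.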

Finally, we must analyze and optimize this logical data model to create the physical data model. The above-mentioned modeling principles, mapping rules, and mapping patterns ensure correct and efficient logical schema. However, efficiency can still be impacted by database engine constraints or finite cluster resources such as typical table partition sizes and data duplication factors. There are some standard optimization techniques that can be used, including partition splitting, inverted indexes, data aggregation, and concurrent data access optimization. These methods can be used for optimizing the physical data model, although we will not be covering this topic in detail in this particular blog entry.
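As a rough illustration of one of these techniques, partition splitting is often done by adding a time bucket to the partition key so that no single partition grows without bound. The sketch below assumes a time-series table keyed by a sensor identifier. The function names, the bucket granularities, and the write rate are hypothetical values chosen for the example.

```python
from datetime import datetime, timezone

# strftime patterns for each supported bucket size (an assumption of this sketch)
BUCKET_FORMATS = {"year": "%Y", "month": "%Y-%m", "day": "%Y-%m-%d"}
BUCKET_SECONDS = {"year": 365 * 86400, "month": 30 * 86400, "day": 86400}


def split_partition_key(sensor_id, ts, bucket="month"):
    """Return the (sensor_id, bucket) pair to use as the partition key."""
    return (sensor_id, ts.strftime(BUCKET_FORMATS[bucket]))


def rows_per_partition(writes_per_second, bucket="month"):
    """Approximate upper bound on rows that land in one bucketed partition."""
    return writes_per_second * BUCKET_SECONDS[bucket]


reading_time = datetime(2017, 10, 30, 12, 0, tzinfo=timezone.utc)
print(split_partition_key("sensor-1", reading_time))          # month bucket
print(split_partition_key("sensor-1", reading_time, "day"))   # day bucket
print(rows_per_partition(1, "day"))
```

A smaller bucket keeps partitions small, but a query that spans several buckets then has to read several partitions, so the bucket size is a trade-off to settle against the actual queries.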

This is the way to go about enabling query-driven data modeling in Apache Cassandra.


October 27, 2017

The Rise of Open Source


"[Open source] gives customers control over the technologies they use instead of enabling the vendors to control their customers through restricting access to the code behind the technologies."

- Eric S Raymond, The Cathedral and the Bazaar


Open source adoption began long ago, in the 1990s, with the introduction of the Linux kernel. Then, there were only about 100 developers who contributed code to Linux. Since then, the Linux community has exploded. Today, over 8,000 developers and 800 companies contribute code to Linux. What has steered such rapid development and adoption of Open Source technologies? I believe there are three key reasons for this. Open Source helps businesses:


  • Improve agility and accelerate business innovation
  • Gain greater flexibility and scalability
  • Realize higher cost savings


We are witnessing open source adoption across all the enterprise layers: experience, application, business, integration, database, and infrastructure. The adoption begins with the experience layer, where it is primarily driven by the need to quickly enhance customer experience. The adoption of open source JavaScript frameworks like Angular and React has made it easier and faster to develop responsive web designs. While improving the customer experience, it has also helped lower maintenance needs. Open Source has also provided freedom of choice and helped organizations move to a polyglot environment. It is no longer necessary to stick to a single technology vendor to realize the benefits of economies of scale. Easy access and reduced cost of usage have made it possible to leverage an ecosystem of technologies across the layers based on the use case and the application attributes.


Today, more than ever, open source adoption is critical for business success, and many organizations are realizing how open source helps them stay ahead of the curve. For instance, a global retail and wholesale giant replaced their Siebel CRM with a new microservices-based application developed on an open source stack. The adoption of open source helped them accelerate their transformation towards an environment which is more agile and adaptive. While the architectural principles were decided and enforced centrally, this organization gave the individual LOBs the freedom to choose their technology stack as long as it was open source.


Despite its advantages, open source also has its own challenges. Customers struggle to identify the right open source technology. Scaling adoption across the organization is difficult considering the dearth and high cost of talent. Finally, large transformational programs mean managing a partner ecosystem to get the best-of-breed solution.


Infosys open source offerings help clients address these challenges for seamless open source adoption.




More importantly, our expertise in delivering such transformation has been gained from vast experience across organizations with varied maturity levels. For example, while some of our clients want ad-hoc open source implementations to prove a concept or a technology, others want it only occasionally for selective lines of businesses. Then, there are those who consistently and strategically leverage open source across all lines of business to differentiate themselves and gain a clearer advantage.


While achieving 100% open source adoption may not always be feasible, we will see a confluence of hosted, proprietary and open source technologies. Open source will gain more traction and play a major part within enterprise architecture even as it co-exists with non-open source technologies. It is recommended to choose an enterprise-supported version of open source to ensure enterprise-grade security, reduce risk of failure, and guarantee response time.


We are already seeing the adoption of open source across industries, including financial service organizations that were deemed the slowest to adopt open source. I believe that, in the future, the push for rapid digitization and cloud adoption will only accelerate the demand for and adoption of open source. Organizations should assess how and where they can leverage it for competitive advantage.

