Infosys’ blog on industry solutions, trends, business process transformation and global implementation in Oracle.


Oracle Stream Analytics

What is a Data Stream:

Speed is one characteristic that drives the world nowadays, whether it is downloading a big file or a movie, or working from home. Merely increasing speed is not sufficient, because the demand for storage is also rising continuously. Services like Netflix and Spotify offered a solution: content is consumed directly on handheld devices at exceptional speed, without downloading it first. These services made it possible to send and receive billions of bytes of data. Because the data flows continuously, like a jet of water, they are called streaming services. Today data streaming exists in many forms; audio, video and media streaming is just one part of it. Huge growth in data and advancements in engineering processes have led to different ways of gathering, analyzing and processing data, and these made it possible to provide near-instantaneous analysis of streamed data.

Why Stream Analytics:

Streaming analytics, or real-time analytics, is an emerging type of analytics that sources data in real time and performs simple operations or calculations on it as it arrives, in order to provide business insight into fast-moving data. It is quite different from traditional data warehousing ETL techniques: there, business calculations are performed on a batch of data overnight, whereas in real-time analytics operations like filtering, aggregation and grouping are performed on a continuously flowing stream of data. A huge amount of data flows from one system to another every minute, and it is observed that organizations which can act on this stream are able to improve their operational efficiency. A wide range of industries can benefit by issuing real-time alerts with the help of data stream analytics. These alerts can be of different types, including promotional alerts, fraud detection alerts and informative alerts. Data stream analytics is a highly scalable, high-throughput and reliable solution. It is typically offered as a cloud-based service, which keeps the cost low because organizations pay as per usage. Multiple vendors such as Microsoft, Oracle and Amazon provide streaming analytics as a cloud solution.
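To make the contrast with overnight batch processing concrete, here is a minimal Python sketch of filtering, grouping and aggregating over a stream in fixed time windows. It is only an illustration of the idea, not code from any vendor's product; the event shape, the function name tumbling_window_totals and the sample values are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical event shape: (timestamp in seconds, source key, amount)
Event = Tuple[int, str, float]


def tumbling_window_totals(
    events: Iterable[Event], window_seconds: int
) -> Iterator[Tuple[int, Dict[str, float]]]:
    """Yield (window_start, totals_per_key) each time a window closes.

    Assumes events arrive in timestamp order. Only the running totals of
    the current window are kept in memory; raw events are not stored.
    """
    current_window: Optional[int] = None
    totals: Dict[str, float] = defaultdict(float)
    for timestamp, key, value in events:
        window_start = timestamp - (timestamp % window_seconds)
        if current_window is not None and window_start != current_window:
            # The previous window is complete: emit its summary and reset.
            yield current_window, dict(totals)
            totals = defaultdict(float)
        current_window = window_start
        if value > 0:  # filtering: ignore non-positive amounts
            totals[key] += value  # grouping by key and aggregating
    if current_window is not None:
        yield current_window, dict(totals)


sample = [
    (0, "atm", 100.0),
    (20, "pos", 40.0),
    (45, "atm", 60.0),
    (61, "pos", 25.0),
    (70, "pos", -5.0),
    (125, "atm", 10.0),
]
results = list(tumbling_window_totals(sample, 60))
for window_start, totals in results:
    print(window_start, totals)
```

Each summary is available as soon as its window closes, instead of after a nightly batch run.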

Oracle Stream Analytics (OSA):

Oracle Stream Analytics is a big-data-based real-time tool which uses in-memory engine technologies for real-time analytics on streaming data. Data streams can source data from applications in different areas, such as sensing equipment, banking point-of-sale terminals, ATMs, Twitter or other social media, and traditional databases or data warehouses. OSA offers a web-based, user-friendly streaming analytics interface for business users. Users can dynamically design, develop and implement instant analytical solutions which give insight into real-time streaming data. One of the biggest advantages of the tool is that it allows users to explore the data with different advanced visualizations like charts, maps and geographic markings. OSA uses Apache Kafka and Apache Spark Streaming integrated with Oracle's engine in order to address the real-time requirements and analytical challenges of its users.

Components:

Stream: As the name suggests, a stream specifies the source of flowing data (not static, continuously changing). The data can be sourced from the stock market, a JMS server, REST APIs, Twitter etc. This stream of data changes with every passing second and is fed to Oracle Stream Analytics for processing.

Reference: A reference is a source of data that is looked up to fetch additional information about the event data. It provides contextual information about the data flowing in the stream. In principle it can be static database tables or static Excel or CSV files; in this release of OSA, however, only Oracle tables are supported as references.

Exploration: An exploration is a set of business rules or criteria defined for exploring and managing the event data. An exploration applies filters to the data, groups the data in different ways, provides summaries of the events etc. Data sources that are already configured can be added or attached to an exploration.

Topology Viewer: The topology viewer provides a graphical representation that showcases the dependencies amongst different entities. Immediate Family and Extended Family are the two contexts supported by OSA topologies. Immediate Family identifies the dependencies between parent and child, whereas Extended Family identifies the dependencies in the full context.

Pattern: A pattern is a simple way to explore event streams based on common business scenarios.

Map: A map is a collection of geo-fences. It is used to locate geographic coordinates received from different sources like GPS.

Shape: A shape is the representation of event data in different forms like charts, pie graphs etc.
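How a stream, a reference and an exploration fit together can be sketched in a few lines of Python. This is only an illustration of the concepts described above, not the OSA interface, where these components are configured in the web UI. The store IDs, cities, amounts and function names are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator

# Reference: static lookup data (a hypothetical stand-in for a database table).
STORE_REFERENCE: Dict[str, str] = {"S1": "Pune", "S2": "Mumbai"}


def stream() -> Iterator[dict]:
    """Stream: a hypothetical source of sales events arriving one by one."""
    yield {"store_id": "S1", "amount": 120.0}
    yield {"store_id": "S2", "amount": 15.0}
    yield {"store_id": "S1", "amount": 300.0}
    yield {"store_id": "S3", "amount": 80.0}


def enrich(events: Iterable[dict], reference: Dict[str, str]) -> Iterator[dict]:
    """Add contextual information from the reference to each event."""
    for event in events:
        city = reference.get(event["store_id"], "unknown")
        yield {**event, "city": city}


def explore(events: Iterable[dict], min_amount: float) -> Dict[str, int]:
    """Exploration: filter events, group them by city and summarize counts."""
    summary: Counter = Counter()
    for event in events:
        if event["amount"] >= min_amount:  # filter
            summary[event["city"]] += 1  # group and count
    return dict(summary)


summary = explore(enrich(stream(), STORE_REFERENCE), min_amount=50.0)
print(summary)
```

The event from store S3 has no entry in the reference, so it is grouped under "unknown" rather than dropped; whether to drop or keep such events is a design choice made for this sketch.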

OSA Architecture:

The first step in OSA is to ingest data, for example from applications or from Oracle GoldenGate change data capture, through Kafka. After that, the sourced stream is examined and analyzed using data pipelines. In a data pipeline the data is queried, business or conditional logic is applied, and patterns are identified on the data streams. All these operations are performed while the data is flowing, without storing it anywhere. Continuous Query Language (CQL) is used for querying the data. CQL is similar to SQL, but it contains additional constructs for pattern matching and recognition. OSA generates the query and the Spark Streaming application automatically. Once the analysis in the data pipeline is complete, the data can be fed into a data lake for deeper analysis, or integration triggers and alerts can be sent immediately. A high-level architecture is shown below:


[Figure: arch.png, high-level OSA architecture]
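The query and pattern-identification steps of a data pipeline can be illustrated with a small Python sketch. In OSA this logic is expressed in CQL that the tool generates; the code below is not CQL and not an OSA API, only a plain-Python analogue. The sensor names, the threshold and the "three rising readings" pattern are assumptions chosen for the example.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Iterator, List, Tuple

# Hypothetical reading shape: (sensor id, measured value)
Reading = Tuple[str, float]


def detect_rising(
    readings: Iterable[Reading], run_length: int = 3
) -> Iterator[Tuple[str, List[float]]]:
    """Yield (sensor_id, values) whenever a sensor reports `run_length`
    strictly rising values in a row. Only the last few values per sensor
    are kept; the stream itself is not stored."""
    recent: Dict[str, Deque[float]] = {}
    for sensor_id, value in readings:
        window = recent.setdefault(sensor_id, deque(maxlen=run_length))
        window.append(value)
        values = list(window)
        if len(values) == run_length and all(
            a < b for a, b in zip(values, values[1:])
        ):
            yield sensor_id, values


def run_pipeline(
    readings: Iterable[Reading], min_value: float
) -> List[Tuple[str, List[float]]]:
    """Query step (filter) followed by the pattern step, event by event."""
    filtered = ((s, v) for s, v in readings if v >= min_value)
    return list(detect_rising(filtered))


sample = [
    ("t1", 70.0),
    ("t2", 80.0),
    ("t1", 72.0),
    ("t2", 79.0),
    ("t1", 75.0),
    ("t1", 74.0),
    ("t1", 1.0),
]
alerts = run_pipeline(sample, min_value=50.0)
print(alerts)
```

The matches returned by run_pipeline stand in for the alerts or triggers that a pipeline would send onward.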


A few reasons why OSA should be chosen over its competitors:

Simplicity: It is a simple-to-use, web-based tool which does not require much technical skill. It can also generate and validate powerful data pipelines automatically.

Apache Spark: OSA is built on Apache Spark, which gives it the flexibility to attach itself to any compliant YARN cluster. Oracle positions it as the first tool in the market to introduce event-by-event processing on Spark Streaming.

Enterprise Grade: OSA can scale out horizontally and is highly available (24x7) for critical workload pipelines, which makes it an enterprise-level tool. Built-in governance ensures that no data is lost at any point in time.

Industry Advantages:

Risk and Fraud Management - The financial industry uses stream analytics to detect fraud at the PoS or online by analyzing the data streams.

Transportation and Logistics - OSA can help in managing fleets, tracking assets and improving supply chain efficiency.

Customer Experience and Consumer Analytics - Knowing the sentiment of customers is key to releasing offers, analyzing trends etc. OSA can play a crucial role in analyzing customer trends.

Telecommunications - OSA can help in proactively monitoring networks. It can also predict network failures and help achieve high availability.

Retail - Instant shopping trends, beneficial shelf arrangements and customer cart utilization responses can be achieved with OSA to increase sales in the retail industry.
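As an illustration of the fraud management use case, the following Python sketch flags a card that is used in two different cities within a short time span. The rule, the ten-minute gap, the card IDs and the city names are all assumptions made for this example; a real fraud model would be far more elaborate, and in OSA such a rule would be built as a pipeline rather than written by hand.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical transaction shape: (timestamp in seconds, card id, city)
Txn = Tuple[int, str, str]


def flag_suspicious(
    transactions: Iterable[Txn], max_gap_seconds: int = 600
) -> Iterator[Tuple[str, str, str]]:
    """Yield (card, previous_city, city) when the same card appears in two
    different cities within `max_gap_seconds`. Only the last transaction
    per card is remembered, so the check runs event by event."""
    last_seen: Dict[str, Tuple[int, str]] = {}
    for timestamp, card, city in transactions:
        previous = last_seen.get(card)
        if previous is not None:
            prev_time, prev_city = previous
            if city != prev_city and timestamp - prev_time <= max_gap_seconds:
                yield card, prev_city, city
        last_seen[card] = (timestamp, city)


sample = [
    (0, "c1", "Pune"),
    (100, "c2", "Delhi"),
    (300, "c1", "London"),
    (5000, "c2", "Mumbai"),
]
fraud_alerts = list(flag_suspicious(sample))
print(fraud_alerts)
```

Card c2 also changes city, but the gap between its two transactions is longer than the allowed window, so it is not flagged.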

 

References: https://docs.oracle.com/middleware/1221/osa/using-streamanalytics/GUID-81454309-164B-4933-B972-A9FEF06159D0.htm#OEPSX127

 

