Emerging Technologies and Methods for Streaming Analytics
Getting the greatest business value from streaming data requires new approaches to real-time operations and analytics.
- By Philip Russom, Ph.D.
- December 9, 2014
According to TDWI's 2013 survey on managing big data, roughly half of user organizations said they are already managing and leveraging streaming data that's generated frequently or continuously by sensors, machines, geospatial devices, and Web servers. However, most of these users are merely capturing and storing streaming data for offline study. To get more value, they need to adopt the emerging technologies and methods for "streaming analytics" -- that is, the real-time, continuous analysis of streaming data. Doing so would enable user organizations to analyze streaming data as it arrives, then take immediate action for the highest business value.
Consider some of today's real-world use cases for streaming analytics:
- Monitor and maintain the availability, performance, and capacity of interconnected infrastructures, such as utility grids, computer networks, and manufacturing facilities
- Understand customer behavior across multiple channels, so you can improve the customer experience as it's happening
- Identify compliance and security breaches, then halt and/or correct them immediately
- Spot and stop fraudulent activity, even as fraud is being perpetrated
- Evaluate sales performance in real time and achieve sales quotas through instant incentives such as discounts, bundles, free shipping, and easy payment terms
Such compelling use cases typically result from a "perfect storm" of emerging data types, software technologies, and fast-paced business methods:
Streaming data: The swelling collection of sensors worldwide (plus the extended "Internet of Things") produces large volumes of streaming data that can be leveraged for business advantage. For example, robots have been in use for years in manufacturing, but now they have additional sensors so they can perform quality assurance, not just assembly. For decades, mechanical gauges have been common in many industries such as chemicals and utilities. Now these gauges are being replaced by digital sensors and "smart meters" that provide real-time monitoring and analysis. GPS and RFID signals now emanate from mobile devices and assets -- ranging from smartphones to trucks to shipping pallets -- so all these can be tracked in real time and controlled precisely.
Streaming analytics: The growing consensus is that analytics is the most direct path to business value drawn from new forms of big data, including streaming data. Existing analytic techniques -- based on mining, statistics, predictive algorithms, queries, scoring, clustering, and so on -- apply well to machine data once it's captured and stored. Luckily, newer vendor tools are reengineering these techniques and creating new analytic methods so they can operate on data that streams continuously as well as on other data in storage.
Continuous analytics: Most analytic operations are scheduled to run on a 24-hour or longer cycle. Getting the most out of streaming data, however, requires analytics that executes or updates every few seconds or milliseconds, processing each event, message, record, transaction, or log entry as it arrives in case the new data signals a business event that requires immediate attention. In other words, continuous analytics goes hand-in-hand with streaming data. Imagine the results of a query incrementally updated with each new event without needing to rerun the query against all pertinent data. Likewise, continuous analytics may rescore an analytic model, recalculate a statistic, remap a cluster, and so on -- but as efficient, incremental updates, not execution from scratch.
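To make the idea of incremental updates concrete, here is a minimal Python sketch that keeps a running mean and variance current as each event arrives, without rescanning earlier data. It uses Welford's online algorithm, a well-known technique chosen here purely for illustration; the class name RunningStats and the sample sensor readings are hypothetical and do not come from any particular vendor tool.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running mean and variance, updated one event at a time (Welford's method)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the current mean

    def update(self, value: float) -> None:
        """Fold one new reading into the statistics in constant time."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        # Uses the deviation from the old mean and from the new mean.
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of every reading seen so far."""
        return self.m2 / self.count if self.count else 0.0


if __name__ == "__main__":
    # Hypothetical sensor readings arriving one at a time.
    stats = RunningStats()
    for reading in [2, 4, 4, 4, 5, 5, 7, 9]:
        stats.update(reading)
        print(f"n={stats.count} mean={stats.mean:.3f} variance={stats.variance:.3f}")
```

Each call to update touches only three stored numbers, so the cost per event stays constant no matter how much data has already streamed past. A real deployment would also need to decide how old data ages out, which this sketch does not address.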
Complex event processing (CEP): Event-processing technology over streaming data has been around for decades, and a recent TDWI Best Practices Survey shows that more than 20% of organizations surveyed are conducting event processing today in their DW/BI solutions. However, traditional event processing tends to be very simple, monitoring one stream of data at a time. The newer practice of CEP can monitor multiple streams at once, correlating events across those streams, correlating streaming data with data of other vintages, and continuously analyzing the results.
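As a rough illustration of cross-stream correlation, the Python sketch below flags events from different streams that share a key (an account ID, say) and fall within a short time window of each other. Everything here is an assumption made for the example: the Event fields, the WindowedCorrelator class, the stream names, and the 60-second window. It also assumes events arrive in timestamp order, which commercial CEP engines do not generally require.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, NamedTuple


class Event(NamedTuple):
    stream: str       # hypothetical stream name, e.g. "logins" or "payments"
    key: str          # correlation key, e.g. an account ID
    timestamp: float  # seconds; events are assumed to arrive in time order


class WindowedCorrelator:
    """Match events that share a key across different streams within a time window."""

    def __init__(self, window_seconds: float) -> None:
        self.window = window_seconds
        self._recent: Dict[str, Deque[Event]] = defaultdict(deque)

    def process(self, event: Event) -> List[Event]:
        """Record the event; return earlier events on other streams it correlates with."""
        buffer = self._recent[event.key]
        # Evict events that have fallen out of the window.
        while buffer and event.timestamp - buffer[0].timestamp > self.window:
            buffer.popleft()
        matches = [e for e in buffer if e.stream != event.stream]
        buffer.append(event)
        return matches


if __name__ == "__main__":
    correlator = WindowedCorrelator(window_seconds=60)
    feed = [
        Event("logins", "acct-1", 0.0),
        Event("payments", "acct-1", 30.0),
        Event("payments", "acct-2", 40.0),
        Event("logins", "acct-1", 100.0),
    ]
    for event in feed:
        for match in correlator.process(event):
            print(f"{event.stream}@{event.timestamp} correlates with "
                  f"{match.stream}@{match.timestamp} for {event.key}")
```

Only recent events are held in memory, so the correlator can run indefinitely against a live feed. Correlating with "data of other vintages" would mean adding a lookup against stored history, which is left out of this sketch.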
Operational intelligence (OI): OI is a new form of business analytics that delivers visibility and insight into business operations and similar processes as they are happening. This new class of enterprise software includes all the capabilities discussed above but in a unified tool that empowers users to explore data streams, understand business processes (as seen via data), model processes, write rules for event-driven alerts and responses, and create full-blown business monitoring and surveillance applications. When these applications run and respond continuously in real time, the result is called continuous operational intelligence.
For more information about the emerging technologies and methods of streaming analytics, read the 2014 TDWI Checklist Report: Using Streaming Analytics for Continuous Operational Intelligence.