Streaming Toward the Future
From continuous analytics to operational intelligence to good old complex event processing, the future is one of streams: lots and lots of streams.
- By Stephen Swoyer
- July 22, 2014
Two recent reports, from International Data Corp. (IDC) and TDWI Research, suggest that businesses are already collecting lots of streaming data, analyzing some of it, and (most important) thinking about What Comes Next: about shifting from traditional retroactive analytics to a new kind of proactive analysis.
"Businesses are taking the necessary steps to gain a deeper understanding of IoT [the Internet of things] and the overall value," said Vernon Turner, a senior vice president with IDC, in a statement. Turner co-authored a new IDC report, Worldwide and Regional Internet of Things (IoT) 2014--2020 Forecast: A Virtuous Circle of Proven Value and Demand, that projects that the market for IoT offerings will increase from an estimated $1.9 trillion today to $7.1 trillion by 2020.
The report from TDWI Research, Using Streaming Analytics for Continuous Operational Intelligence, notes that a 2013 TDWI survey (Managing Big Data) found that almost half of organizations are already collecting streaming data, i.e., information generated by machines, sensors, Web applications, and other resources. That's the good news. The not-so-good news is that businesses aren't doing as much with this information as they could or should be.
"[M]ost of these users are today merely capturing and storing streaming data for offline study, whereas they need to mature [what they're doing] by using real-time practices and technologies. This would enable them to analyze streaming data as it arrives, then take immediate action for the highest business value," writes Philip Russom, research director with TDWI Research, in the new report.
Russom anticipates that this will change over time. The potential uses and applications for real- or right-time analytics -- among which is "operational intelligence" (OI), touted by vendors such as Splunk Inc. and Vitria Inc. -- are simply too compelling, he argues. These include the capacity to monitor, maintain, and proactively respond to anomalies in interconnected infrastructures, such as utility grids, computer networks, and manufacturing systems; the ability to track and analyze customer behavior across multiple channels in (or close to) real time; the capacity to identify compliance and security issues -- or (a separate application) to detect fraudulent activity -- and respond as they're taking place; and the ability to evaluate sales performance in real time, which should make it easier to meet quotas as well as to appropriately target and tailor incentives.
"The growing consensus is that analytics is the most direct path to business value drawn from new forms of big data, which includes streaming data," Russom continues. "Existing analytic techniques -- based on mining, statistics, predictive algorithms, queries, scoring, clustering, and so on -- apply well to machine data once it's captured and stored. Luckily, newer vendor tools are reengineering these and creating new analytic methods so they can operate on data that streams continuously."
In addition to the established practice of complex event processing (CEP), Russom cites another emerging application, so-called "continuous analytics," which aims to radically compress the 12- or 24-hour batch windows of traditional analytical workloads.
"Getting the most out of streaming data ... requires analytics that execute or update every few seconds or milliseconds to process each event, message, record, transaction, or log entry as it arrives in case the new data signals a business event that requires immediate attention. In other words, continuous analytics go hand-in-hand with streaming data," he writes. "Imagine the results of a query incrementally updated with each new event without needing to rerun the query against all pertinent data. Likewise, continuous analytics may rescore an analytic model, recalculate a statistic, remap a cluster, and so on but as efficient, incremental updates, not execution from scratch."
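To make Russom's point about incremental updates concrete, here is a minimal Python sketch of one way a statistic can be recalculated per event without rerunning it against all prior data. It uses Welford's online algorithm for mean and variance; the class name RunningStats and the sample readings are illustrative assumptions, not anything taken from the TDWI report or from a particular vendor tool.

```python
from dataclasses import dataclass
import math


@dataclass
class RunningStats:
    """Hypothetical helper: keeps mean/variance current as events arrive."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        # Welford's update: O(1) work per event, no stored history needed.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; defined as 0.0 until two events have arrived.
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0

    @property
    def stddev(self) -> float:
        return math.sqrt(self.variance)


if __name__ == "__main__":
    stats = RunningStats()
    for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:  # made-up events
        stats.update(reading)
        print(f"n={stats.count} mean={stats.mean:.3f} stddev={stats.stddev:.3f}")
```

Each call to update touches only three numbers, so the cost per event stays constant however long the stream runs. That is the property the quote describes, shown here for a single statistic rather than a full query or model.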
The difference between continuous analytics and OI could best be described as one of standardization and scope. Continuous analytics tends to be insight-specific: it involves the periodic scheduling of analytic workloads (sometimes every few milliseconds) that sift through samples to identify known phenomena such as events and anomalies. A continuous analytic workload might stop with the identification of an event or anomaly, or it might kick off one or more epi-processes (i.e., rules or decision trees that help automate a response to the detection of an event or anomaly). OI aims to provide a single context in which to do this -- as well as to collect, integrate, synthesize, and analyze information from streaming sources, ERP applications, legacy or custom applications, Web applications, and so on.
OI aspires to connect a power outage in Waco, Texas, with a hiccup in a new-customer provisioning system in Fort Worth. (It isn't just a question of identifying the source of an outage and responding to it; it's the chain reaction of problems that an isolated incident -- such as a power outage -- might cause, especially now that processes are orchestrated across distributed systems and networks.) Continuous analytics, then, has a bounded, strictly delimited context; OI aims for something more: call it Big Context.
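The bounded kind of continuous analytic workload described above, one that detects a known phenomenon and then optionally kicks off a rule, might look something like the following Python sketch. Everything in it is an assumption made for illustration: the monitor function, the sliding-window z-score test, the window and threshold values, and the sample readings are hypothetical and do not come from the TDWI report or any vendor product.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Callable, Deque, Iterable, List

# A rule is any callable invoked when an anomaly is detected (hypothetical type).
Rule = Callable[[float, float], None]  # arguments: (value, z_score)


def monitor(readings: Iterable[float], rules: List[Rule],
            window: int = 20, threshold: float = 3.0) -> List[float]:
    """Flag readings that deviate sharply from the recent window and fire rules."""
    recent: Deque[float] = deque(maxlen=window)
    flagged: List[float] = []
    for value in readings:
        if len(recent) == window:  # wait until the window is full
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma > 0:
                z = abs(value - mu) / sigma
                if z > threshold:
                    flagged.append(value)
                    for rule in rules:  # the automated follow-on response
                        rule(value, z)
        # Simplification: anomalous readings still enter the baseline window.
        recent.append(value)
    return flagged


if __name__ == "__main__":
    def page_operator(value: float, z: float) -> None:
        print(f"ALERT: reading {value} is {z:.1f} standard deviations out")

    sample = [10.0, 10.1, 9.9, 10.0, 10.2, 50.0, 10.0]  # made-up sensor data
    monitor(sample, [page_operator], window=5)
```

The sketch sees only its own stream and its own rules, which is the "bounded, strictly delimited context" of continuous analytics. Correlating its alerts with events from other systems is the wider job the article ascribes to OI, and nothing here attempts it.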
"OI is a new form of business analytics that delivers visibility and insight into business operations and similar processes, as they are happening," Russom writes.
"This new class of enterprise software includes all the capabilities discussed above, but in a unified tool that empowers users to explore data streams, understand business processes (as seen via data), model processes, write rules for event-driven alerts and responses, and create full-blown business monitoring and surveillance applications," he points out. "When these applications run and respond continuously in real time, you have continuous operational intelligence."
You can download a copy of Russom's TDWI report, free of charge, at http://tdwi.org/research/list/tdwi-checklist-reports.aspx.