Operational Intelligence: Footloose and Schema-free
        
        Because big data is fast-paced and unpredictable, operational intelligence outstrips the capabilities of traditional BI or DW tools, argues Vitria CTO Dale Skeen.
        
- By Stephen Swoyer
- April 30, 2013
Data from sources such as sensors and machines can lose value as it ages.  That's why landing it in a traditional data warehouse (DW) -- or even in a place such as Hadoop -- doesn't make much sense, says Dale Skeen, founder and CTO of operational intelligence (OI) specialist Vitria Technology Inc.
"Hadoop is not on our critical path for processing," he explains, adding that Vitria can and does use Hadoop as a historical repository. 
"We can take everything out of Hadoop [as we need it]. We're storing [data in Hadoop] for historical context -- to enable offline analytics. That's our Hadoop story." 
Vitria specializes in OI, which is why Skeen discounts the Hadoop landing-zone scenario, at least for time-sensitive data. Data is time-sensitive if it contains clues about an imminent event; this could be anything from impending hardware failure to a glitch or malfunction in a network of devices, applications, and services. 
"For operational intelligence, there are a number of use cases, mostly dealing with machine data, that are event-centric," he indicates, invoking the example of a telco customer: "We might be monitoring network data, but the key that we provide is the ability to combine and correlate machine data with customer data with data from the business processes."
Skeen cites Vitria customer TXU, which presented at TDWI's recent Executive Summit in Las Vegas. "In order to be provisioned, [a new customer] would have to pass through a bunch of different systems, enacting different business processes -- most of them SAP," he explains. "Owing to a series of broken business processes within SAP, or between SAP and other applications, this [i.e., provisioning] would never happen. In fact, what they discovered once they brought us in was that they had a high volume of hidden processes, too. They're hidden because they're hard-coded in various applications."
Low-level Visibility
Provisioning new customer accounts involves synchronous and asynchronous choreography between and among different systems. Sometimes a company may have excellent visibility at the physical level -- i.e., the system or network level -- and less useful visibility at the business process level, particularly if processes involve interactions between systems or applications. Sometimes the reverse is the case. The challenge is knitting together -- synthesizing -- information from a fabric of different sources, Skeen explains. 
Vitria offers connectors for SAP and Oracle -- the kinds of applications that are usually embedded in (or which arguably constitute) business processes. In TXU's case, it was able to install Vitria's SAP connector and start sucking up information.
Platform-wise, Vitria uses a real-time analytics engine -- i.e., a complex event processing (CEP) engine -- that it pairs with its business process management (BPM) technology. The CEP engine permits it to capture and analyze event or message traffic from devices, sensors, machines, and other sources. Its new Vitria 4.0 release includes both a desktop modeling and design component -- Vitria Analyst Workbench -- and a toolkit for developing "OI" applications for mobile devices. The former is geared toward business users, says Skeen: it provides a drag-and-drop facility for modeling processes and feeds. 
Vitria doesn't use a traditional database or data warehouse as a repository; instead, it uses what Skeen calls an "elastic grid," which is also new to Vitria 4.0.
"This is elastic scalability, which is similar to what Hadoop offers, except we do this for data in motion. We do different types of optimizations for this [i.e., data in motion]," he explains.
"We're schema-less. Regardless of whether [the data we're consuming is in the form of] flexible schemas, richer schemas, [or] information that's partially structured -- we use third-party tools to help mine it for meaning. If it's partially structured or richly structured, all of the structure and semantics are inherently exposed in the data itself," Skeen indicates.
He says traditional business intelligence or data management tools are ill-suited for the fast-moving -- and unpredictable -- world of OI. 
"Traditional tools have been built on the relational model where you have to know the schema in advance; you have to have it structured in this way," he says. "It's a good paradigm; it makes things efficient, but we're past that: it's [based] on the premise that you can know your data -- that you can know your structures -- in advance, and that you can control everything."
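To make the contrast with schema-first tools concrete, here is a minimal sketch of reading self-describing records without declaring their structure in advance. It is an illustration only, not Vitria's implementation: the JSON encoding, the `discover_fields` helper, and the sample device records are all assumptions made for the example.

```python
import json

def discover_fields(raw_events):
    """Map each field name to the set of Python type names seen for it.

    No schema is declared up front; the structure is read from the
    records themselves as they arrive.
    """
    seen = {}
    for raw in raw_events:
        record = json.loads(raw)
        for key, value in record.items():
            seen.setdefault(key, set()).add(type(value).__name__)
    return seen

# Hypothetical machine events with differing shapes.
raw_events = [
    '{"device": "meter-17", "temp_c": 71.5}',
    '{"device": "router-3", "status": "link_down", "port": 4}',
]

fields = discover_fields(raw_events)
for name in sorted(fields):
    print(name, sorted(fields[name]))
```

The two records share only the `device` field, yet both are consumed by the same code path, which is the property a relational table with a fixed column list does not offer.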
In practice, Skeen concludes, customers opt to store streaming data both in Vitria and outside of it. Even if Hadoop isn't an ideal platform for continuous real-time analytics, it is ideal for historical analyses involving staggering volumes of data. That's how Vitria uses it.
"As soon as the stream comes in, we split it. We take it through the operational intelligence store [i.e., the Vitria platform] and also into the big data store [i.e., Hadoop]," he says. "In the big data store, we can do predictive data mining techniques over all of this historical data. Once we discover … predictive patterns, what we can do is upload it into the operational intelligence [store] … and continuously monitor [for] those patterns in real time."
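The split-and-feed-back loop Skeen describes can be sketched in a few lines. This is a toy model under stated assumptions, not Vitria's or Hadoop's API: the `StreamSplitter` class, the `mine_patterns` function, the in-memory list standing in for the big data store, and the event types are all hypothetical.

```python
from collections import deque

class StreamSplitter:
    """Send every event to both a historical archive and a real-time monitor."""

    def __init__(self, window=3):
        self.archive = []                   # stand-in for the big data store
        self.recent = deque(maxlen=window)  # sliding window on the real-time path
        self.patterns = set()               # patterns uploaded from offline mining
        self.alerts = []

    def ingest(self, event):
        self.archive.append(event)          # historical path
        self.recent.append(event["type"])   # real-time path
        window = tuple(self.recent)
        if window in self.patterns:
            self.alerts.append(window)

def mine_patterns(archive, failure_type, length=3):
    """Offline step: collect the event-type sequences that preceded each failure."""
    types = [e["type"] for e in archive]
    found = set()
    for i, t in enumerate(types):
        if t == failure_type and i >= length:
            found.add(tuple(types[i - length:i]))
    return found

splitter = StreamSplitter(window=3)

# 1. Historical traffic accumulates in the archive.
for t in ["ok", "temp_high", "fan_slow", "volt_drop", "failure"]:
    splitter.ingest({"type": t})

# 2. Mine the archive offline, then upload the patterns to the monitor.
splitter.patterns.update(mine_patterns(splitter.archive, "failure"))

# 3. New traffic is checked against those patterns as it arrives.
for t in ["ok", "temp_high", "fan_slow", "volt_drop"]:
    splitter.ingest({"type": t})

print(splitter.alerts)
```

In this sketch the alert fires on the precursor sequence, before any new failure event appears, which is the point of monitoring mined patterns in real time rather than querying the archive after the fact.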