Future BI: The Era of Open or Saleable Data

Big data was bound to be supplanted or augmented by another model -- open data.

In a new report, industry watcher Gartner Inc. argues that an open data strategy will permit companies to monetize their data assets by selling, trading, or exchanging them.

Simply put: if you have a data warehouse, you have the potential to use it to generate revenues.

The market watcher counsels clients to "investigate the types of data exchange now emerging where information producers and consumers share data for profit," according to a Gartner release. Research vice president David Newman cited the use of open data concepts or principles by government agencies, which he says "are now opening their data to the public Web to improve transparency." He also points out that "more commercial organizations are using open data to get closer to customers, share costs with partners, and generate revenue by monetizing information assets."

The concept of "saleable" data isn't new. Companies have been selling data for about as long as they've been collecting it. Saleable data isn't an innovation from the perspective of enterprise data management (DM), either: DM teams have been buying or consuming outside data -- typically in the form of subscription or list data from specialty providers, credit reporting agencies, public sector agencies, and other sources -- for decades now.

Open data is (or promises to be) something different, however. Everybody's creating and marketing data, so although the concept of saleable data itself isn't new, that of an information marketplace -- e.g., an App Store-like bazaar in which companies can buy, sell, and trade data -- is.

That said, Gartner's open data vision might not be as futuristic as it sounds -- and it might be much less ambitious than it could or should be.

According to Yves de Montcheuil, vice-president of marketing with open source software (OSS) data integration (DI) specialist Talend, companies are generating more data than ever.

This is far from a controversial contention, de Montcheuil concedes: it's one of the reasons big data has as much cachet as it does. There's another wrinkle here, too, he says: enterprises are collecting more data in more contexts, many of which are outside of their core markets or competencies. This data isn't (necessarily) grist for the kinds of data-driven analytics that businesses expect to use to more effectively compete against their rivals, or which might allow them to identify new sales opportunities or optimize their marketing.

This is data that has value precisely because a company is uniquely positioned -- by virtue of its location, partners, suppliers, customers, or products, among other factors -- to collect it.

Although companies might use this data to help enhance -- or in some cases, to drive -- their internal analytic practices, they'll likewise look to sell it.

In some cases, de Montcheuil suggests, data could ultimately be more valuable as a saleable product than as grist for internal analysis.

"If you think about the [so-called] 'Internet of things,' where you have billions of devices connected to the Internet producing logs and generating data, this information is valuable," he comments, citing the example of dump trucks and other types of heavy construction vehicles, which are being outfitted with sensors -- each of which is assigned an IP address -- designed to generate telemetry information, logs, and other kinds of data.

"Most of the time, this data is not being harvested. I was writing an article ... about Ford, about how Ford has a huge potential for harnessing big data. Each of their cars is collecting gigantic mounds of data, except that this data remains in the car until it goes to the dealer," he continues. "Think of this car which is providing sensor information about just everything in the engine, and beyond the engine, could become like an intelligent sensor on the road. Ford could become a provider of traffic information, of air quality data, of weather information."

Harriet Fryman, director of business analytics software with IBM Corp., uses the example of California's FasTrak program, an electronic toll collection system that's similar to the E-ZPass system used throughout the Northeastern United States.

"Its job was initially to speed up the traffic flow at tollbooths, but it collects a vast amount of information [in the process]. Just think about how useful that information is now," she argues. The point, she says, is that enterprises are in unique positions to collect information. That's one reason why she sees machine or sensor data as so potentially valuable.

At this summer's Pacific Northwest BI Summit, for example, she predicted that the "value of machine data will ultimately outweigh the value of social media data. There are many more sensors and many more objects that create information in our instrumented world."

Companies are already, albeit tentatively, exchanging some of the extraneous or supplementary information they're collecting. The radical shift, de Montcheuil argues, will occur as companies increasingly attempt to collect as much information -- in as many different contexts -- as they can, with an eye toward productizing and selling it. "Fifteen years ago, there were only a few industries that had to use data from their partners. Today, we are all in turn dependent on one another. The data [we're collecting] is very diverse; it's almost pervasive," he concludes.

"Eventually, [enterprises will] be collecting data everywhere."

Evolution, Not Revolution

Where Gartner sees the dawning of a new era -- i.e., that of the "open data" epoch -- industry veteran Mark Madsen, a principal with consultancy Third Nature Inc., sees the evolution of existing practices. Today, Madsen says, almost all enterprises consume data from third-party sources; some likewise package and resell data that they themselves have collected.

"Every company out there -- almost -- rents lists," he observes, noting that in many cases enterprises aren't even buying lists of personally identifiable information. As Joe Friday might put it: just the attributes, ma'am.

"[Enterprises] can cross-reference [purchased data] with their internal lists, augment it with any attributes that are useful from [a purchased] set, and build up interest profiles. It's interesting, and [probably] legal since you are sharing someone's data with a third party and they are keeping attributes but not the core record," Madsen comments.

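What Madsen describes amounts to a keyed join between two lists. The following is a minimal sketch in Python, not anything Madsen or a list vendor has published. It assumes both lists share some match key (a hypothetical `key` field here), and every record, field name, and function name is made up for illustration.

```python
# Hypothetical internal customer list, keyed by a shared match key.
internal = {
    "c001": {"name": "A. Jones", "segment": "retail"},
    "c002": {"name": "B. Smith", "segment": "wholesale"},
}

# Hypothetical purchased list: attributes only, no names or addresses.
purchased = [
    {"key": "c001", "interests": ["cycling", "organic food"]},
    {"key": "c003", "interests": ["golf"]},
]


def build_profiles(internal, purchased, wanted=("interests",)):
    """Copy the internal records and add the wanted purchased attributes."""
    # Copy each record so the original internal list is left untouched.
    profiles = {key: dict(record) for key, record in internal.items()}
    for row in purchased:
        profile = profiles.get(row["key"])
        if profile is None:
            continue  # no matching internal record, so skip this row
        for attr in wanted:
            if attr in row:
                profile[attr] = row[attr]
    return profiles


profiles = build_profiles(internal, purchased)
```

Only records that already exist internally are augmented; purchased rows with no internal match are dropped rather than added, which mirrors the idea of keeping useful attributes without taking on the seller's core records.
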
To the extent that "open data" names an App Store-like information marketplace, it does mark a departure from the DM status quo, at least for most enterprises. That said, Madsen notes, some verticals (such as retail) already maintain proto-information marketplaces.

"Every retailer sells or at least trades their data to one or more data syndicators," he explains, noting that one of his previous employers routinely purchased such information from supermarket giant Whole Foods Inc. as well as one of its competitors.