TDWI Articles

On the Use and Abuse of Analytics

We need to be critical of analytics insights and recognize that analytics are not absolute representations of an objective business reality.

Over time, the use of analytics will change how a business structures and manages its day-to-day operations, defines and assesses its priorities and performance metrics, and develops and executes on its long-term strategy.

During a presentation at the last Pacific Northwest BI and Analytics Summit, industry veteran Donald Farmer discussed a few of the issues we need to keep in mind as we take up and use predictive analytics, deep learning, artificial intelligence (AI), and other advanced analytics technologies. (Farmer is a principal with information management consultancy TreeHive Strategy and often teaches at TDWI events. He led an "Ask the Experts" session on data literacy for TDWI members on September 14, 2017.)

One of these issues concerns the nature of analytics.

Analytics as Notation

Analytics is a way of describing, of representing, something about the business and its world. In Farmer's language, analytics is a kind of business "notation" -- analogous to the symbols on sheet music that denote the melodies, rhythms, chords, and tempo of a musical piece. "An analytic[al] business ... actually operates on two levels. There's the basic level which is the business [itself], the life of the business, [for example, an airline] moves people from one place to another," he told attendees.

"Analytics [by contrast] describes the business. It's [a means of] describing and investigating ... the business. Analytics actually notates the business. It's a form of business notation."

If you can read business sheet music, Farmer says, you can translate the symbols of this notation into actionable intelligence. However, in business (as in performance), this doesn't mean the music you're playing is an absolute facsimile of the business as it exists -- or the music as it was originally conceived by the composer.

In the first place, notation can't capture everything. It doesn't pretend to. Even though advanced analytics technologies permit us to capture "more" -- to model a richer, more realistic, more actionable business world -- what they describe or predict is nonetheless only an approximation of business reality. Furthermore, it is an approximation of business reality as we've known it.

Data Is Interpretation, Too

Farmer also told attendees that data itself is only an approximate representation of events, conditions, circumstances, etc.

This might sound noncontroversial, but Farmer's claim was met with skepticism. Most of us are used to thinking of data as impersonal, as objective -- as a pure representation of something about reality. The data is the data, we say; the data doesn't lie. It just is. It's only when we interpret data that problems arise, we tell ourselves.

"Data is always a symbolic representation. When you're dealing with data, you're not dealing with the real world. There are always assumptions between the real world and the collection, aggregation, and analysis of data," Farmer pointed out.

He cited the way we model and analyze clickstream data as one example: "How we note the experience and classify [the events] and represent [them] is all a construction."

Gartner analyst Merv Adrian, who was also in attendance, noted that Internet-of-Things (IoT) data is an even better example. For the purposes of upstream analysis, temperature, pressure, and other nominally "objective" metrics are not, properly speaking, objective, Adrian suggested: context is always presupposed.

"The things the metric [is supposed to] represent ... might not even be the same things" across different sensors, he pointed out.

The data from different IoT streams isn't necessarily commensurable -- even if it purports to describe the same conditions or phenomena. It isn't just that (for example) pressure measurements might be expressed in different units -- pounds per square inch (psi) versus pascals (Pa) -- it's that certain kinds of metrics tend to be used for certain kinds of applications. These metrics measure very different things: gauge pressure as distinct from absolute pressure, or differential pressure as distinct from sealed pressure.

Far from being "objective," this data presupposes a context. In many cases, it also embodies as-yet-unknown biases.
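The difference between a unit mismatch and a reference mismatch can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PressureReading` type, the `to_absolute_pa` function, and the choice of one standard atmosphere as the ambient pressure are hypothetical and do not come from Farmer or Adrian. The sketch shows that converting psi to Pa is mechanical, while making a gauge reading comparable with an absolute one requires an assumption about context, and differential or sealed readings cannot be reconciled at all without information the data stream itself does not carry.

```python
from dataclasses import dataclass

PA_PER_PSI = 6894.757293168          # pascals in one pound per square inch
STANDARD_ATMOSPHERE_PA = 101_325.0   # ASSUMPTION: ambient pressure is one standard atmosphere


@dataclass(frozen=True)
class PressureReading:
    """Hypothetical sensor reading; field names are illustrative only."""
    value: float
    unit: str       # "psi" or "Pa"
    reference: str  # "gauge", "absolute", "differential", or "sealed"


def to_absolute_pa(reading: PressureReading,
                   ambient_pa: float = STANDARD_ATMOSPHERE_PA) -> float:
    """Return the reading as absolute pressure in pascals, if that is possible."""
    # Step 1: the unit conversion is purely mechanical.
    if reading.unit == "psi":
        pascals = reading.value * PA_PER_PSI
    elif reading.unit == "Pa":
        pascals = reading.value
    else:
        raise ValueError(f"unknown unit: {reading.unit!r}")

    # Step 2: the reference point needs context the number does not carry.
    if reading.reference == "absolute":
        return pascals
    if reading.reference == "gauge":
        # Gauge pressure is measured relative to ambient pressure,
        # so the result is only as good as the ambient assumption.
        return pascals + ambient_pa
    # Differential and sealed readings depend on a second pressure
    # (the other port, or the sealed-in reference) that is not in the reading.
    raise ValueError(
        f"cannot convert a {reading.reference!r} reading without more context"
    )


if __name__ == "__main__":
    tire = PressureReading(30.0, "psi", "gauge")
    tank = PressureReading(30.0, "psi", "absolute")
    # Same number, same unit, different physical conditions.
    print(to_absolute_pa(tire), to_absolute_pa(tank))
```

Two readings with the same value and the same unit can therefore describe different conditions, and the corrected figure for the gauge reading changes whenever the assumed ambient pressure changes.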

The Priority of Human Oversight and Discretion

Measurement in some contexts (e.g., temperature and other physical quantities) is better understood, or more predictable, than in others. Metrics that correspond to real-world business contexts, by contrast, are less likely to be well understood. Their predictive power is limited to specific events -- e.g., customer churn, intent to buy, likelihood of fraud, etc.

Unlike the readings we get from the technology we use to capture and measure temperature data, the signals and phenomena we use to detect customer churn will become less reliable over time. This is a function of change. Analytics are terrible at identifying, let alone predicting, disjunctive change.

Why does this matter? According to Farmer, it's important to be critical of analytics insights and recognize that analytics are not absolute representations of an objective business reality. In the first place, this recognition will help guide our decision making and make it possible for us to manage our expectations about what analytics can and can't achieve for us. In the second place, this recognition will help us determine how we should and shouldn't use analytics. We can automate decision making in certain contexts -- a job-candidate-screening process, for example -- but should we?

If so, which kinds of decisions should we automate? Which decisions should still require human oversight and/or approval? In which contexts is human discretion appropriate or desired? In which contexts should the use of machine analytics be tightly controlled?

To recognize that analytics insights are not infallible representations of an objective business reality isn't to reject or deemphasize the value of these insights. It's to encourage the responsible and efficacious use of analytics. "If you're aware that it's a construction, that enables the possibility of [improving it]," Farmer pointed out.
