Operational Business Intelligence: Sorting Out Your Options
- By Wayne Eckerson
- January 14, 2010
Operational business intelligence (BI) means
many things to many people, but the nub is that it delivers
information to decision makers in near real time, usually
within seconds, minutes, or hours. The purpose is to empower
front-line workers and managers with timely information so
they can work proactively to improve performance.
When you embark on an operational BI project, you must
make a key architectural decision. This sounds easy but
it's hard to do. Low-latency, or operational, BI systems have many moving parts, and there is little time to recover from errors, especially in high-volume environments. The key decision when architecting a low-latency system is whether or not to use the data warehouse (DW).
The ramifications of this decision are significant.
On one hand, the DW will ensure the quality of low-
latency data. However, doing so may disrupt existing
processes, add undue complexity, and adversely impact
performance. On the other hand, creating a standalone
operational BI system may be simpler and provide tailored
functionality and higher performance. This choice means
you may create redundant copies of data that compromise
data consistency and quality.
Take your pick: either add complexity by re-architecting
the DW or undermine data consistency by deploying a
separate operational BI system. It's kind of a Faustian
bargain, and neither option is quick or cheap.
Working Within the DW
If you choose to deliver low-latency data within your
existing DW, you have three options:
- Mini-Batch. With this option, you simply accelerate existing ETL jobs by running them more frequently. If your DW supports mixed workloads (e.g.,
simultaneous queries and updates), this approach allows you
to run the DW 24x7. Many organizations start by loading the
DW hourly and then move to 15-minute load intervals, if
needed. Of course, your operational systems may not be
designed to support continuous data extracts, so this is a
consideration. Many companies start with this option since
it uses existing processes and tools and just runs them
faster.
- Change Data Capture. Another option is
to apply change data capture (CDC) and replication tools
that capture new and changed records from system logs and move them in
real time to an ETL tool, staging table, flat file, or
message queue so they can be loaded into the DW. This
approach minimizes the impact on both your operational
systems and DW since you are only updating records that
have changed (instead of wiping the slate clean with each
load). Some companies combine both mini-batch and CDC to streamline processes even further; a minimal sketch of this combined pattern follows the list.
- Trickle-Feed. The last option is to
trickle-feed records into the DW directly from an
enterprise service bus (ESB) if your company has one. Here,
the DW subscribes to selected events that flow through a
staging area and into the DW. Most ETL vendors sell
specialized ESB connectors to trickle-feed data, or you can program a custom interface. This is the most complex of the three approaches because there is no time to recover from a failure, but it provides the most up-to-date data possible (a simple consumer loop is sketched after this list).
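To make the mini-batch and CDC options concrete, here is a minimal Python sketch of a combined pattern: a change table captures inserts, updates, and deletes, and a scheduled job applies only the changes recorded since the last high-water mark. The table layouts, column names, and the SQLite stand-ins for the source system and the DW are illustrative assumptions; real CDC tools typically read database transaction logs rather than a change table, and the job would run on the hourly or 15-minute schedule described above.

```python
import sqlite3

# Stand-ins for the operational source system and the data warehouse.
source = sqlite3.connect(":memory:")
dw = sqlite3.connect(":memory:")

# Hypothetical change table populated by a CDC/replication tool.
source.execute("""CREATE TABLE change_log (
    seq INTEGER PRIMARY KEY, op TEXT, order_id INTEGER, amount REAL)""")
dw.execute("CREATE TABLE fact_orders (order_id INTEGER PRIMARY KEY, amount REAL)")
dw.execute("CREATE TABLE etl_watermark (last_seq INTEGER)")
dw.execute("INSERT INTO etl_watermark VALUES (0)")

def mini_batch_load():
    """Apply only the changes captured since the last run (the high-water mark)."""
    last_seq = dw.execute("SELECT last_seq FROM etl_watermark").fetchone()[0]
    changes = source.execute(
        "SELECT seq, op, order_id, amount FROM change_log WHERE seq > ? ORDER BY seq",
        (last_seq,)).fetchall()
    for seq, op, order_id, amount in changes:
        if op in ("insert", "update"):
            dw.execute(
                "INSERT INTO fact_orders VALUES (?, ?) "
                "ON CONFLICT(order_id) DO UPDATE SET amount = excluded.amount",
                (order_id, amount))
        elif op == "delete":
            dw.execute("DELETE FROM fact_orders WHERE order_id = ?", (order_id,))
        last_seq = seq
    dw.execute("UPDATE etl_watermark SET last_seq = ?", (last_seq,))
    dw.commit()

# Simulate a few captured changes, then run one mini-batch cycle.
source.executemany(
    "INSERT INTO change_log (op, order_id, amount) VALUES (?, ?, ?)",
    [("insert", 1, 100.0), ("insert", 2, 250.0), ("update", 1, 120.0)])
source.commit()
mini_batch_load()   # in production this runs on the hourly or 15-minute schedule
print(dw.execute("SELECT * FROM fact_orders ORDER BY order_id").fetchall())
```

Because only changed rows move, the load stays small even at short intervals, and the upsert keeps it idempotent, so a failed cycle can simply be rerun from the old watermark.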
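For the trickle-feed option, the connector boils down to an event-consumer loop: subscribe to the bus, then write each event into the DW the moment it arrives. The sketch below simulates the ESB with Python's built-in queue module because the real subscription API depends on your bus and ETL connector; the event fields and the fact_events table are assumptions for illustration only.

```python
import json
import queue
import sqlite3

# Simulated enterprise service bus: a real connector would subscribe to
# a topic on the actual ESB or message broker instead of a local queue.
bus = queue.Queue()
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE fact_events (event_id TEXT PRIMARY KEY, event_type TEXT, amount REAL)")

def publish(event):
    """Stand-in for the ESB delivering an event the DW has subscribed to."""
    bus.put(json.dumps(event))

def trickle_feed(max_events=None):
    """Consume events one at a time and insert them into the DW as they arrive."""
    handled = 0
    while max_events is None or handled < max_events:
        try:
            message = bus.get(timeout=1)   # block briefly waiting for the next event
        except queue.Empty:
            break
        event = json.loads(message)
        # Idempotent insert: a re-delivered event does not create a duplicate row.
        dw.execute("INSERT OR REPLACE INTO fact_events VALUES (?, ?, ?)",
                   (event["event_id"], event["event_type"], event["amount"]))
        dw.commit()                        # committing per event keeps the DW current
        handled += 1

publish({"event_id": "e-1001", "event_type": "order_placed", "amount": 75.0})
publish({"event_id": "e-1002", "event_type": "order_placed", "amount": 30.0})
trickle_feed(max_events=2)
print(dw.execute("SELECT * FROM fact_events").fetchall())
```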
Working Outside the DW
If your existing DW doesn't lend itself to operational
BI for architectural, political, or philosophical reasons,
you need to consider building or buying a complementary
low-latency decision engine. There are three options here:
- Data Federation. Data federation tools
query and join data from multiple source systems on the
fly. These tools create a virtual data mart that can
combine historical and real-time data without the expense
of creating a real-time DW infrastructure. Data federation tools are ideal when the number of data sources, volume of data, and complexity of queries are low (a minimal federated join is sketched after this list).
- ODS. Companies often use operational
data stores (ODSes) when they want to create operational
reports that combine data from multiple systems and don't
want users to query source systems directly. An ODS
extracts data from each source system in a timely fashion
to create a repository of lightly integrated, current
transaction data. To avoid creating redundant ETL routines
and duplicate copies of data, many organizations load their
DW from the ODS.
- Event-driven Engines. Event-driven
analytic engines apply analytics to event data from an ESB
as well as static data in a DW and other applications. The
engine filters events, applies calculations and rules in
memory, and triggers alerts when thresholds are exceeded (a bare-bones example follows this list). Although tailored to high-volume, real-time requirements, these systems can also support general-purpose BI applications.
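For the data federation option, the essence is a join performed at query time against live connections, with nothing persisted in between. The sketch below fakes two source systems with in-memory SQLite databases and assembles a "virtual" result set in Python; a real federation tool would generate and push down this SQL for you, so the schemas and join key here are purely illustrative.

```python
import sqlite3

# Two "source systems" faked with in-memory SQLite databases.
crm = sqlite3.connect(":memory:")
orders = sqlite3.connect(":memory:")

crm.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT, region TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                [(1, "Acme Corp", "East"), (2, "Globex", "West")])

orders.execute("CREATE TABLE open_orders (order_id INTEGER, customer_id INTEGER, amount REAL)")
orders.executemany("INSERT INTO open_orders VALUES (?, ?, ?)",
                   [(10, 1, 500.0), (11, 1, 125.0), (12, 2, 80.0)])

def federated_open_orders_by_customer():
    """Query each source at request time and join the results in memory."""
    customers = {cid: (name, region)
                 for cid, name, region in crm.execute("SELECT * FROM customers")}
    totals = {}
    for _order_id, cid, amount in orders.execute("SELECT * FROM open_orders"):
        totals[cid] = totals.get(cid, 0.0) + amount
    # The "virtual data mart": nothing is persisted; results are assembled on the fly.
    return [(customers[cid][0], customers[cid][1], total)
            for cid, total in sorted(totals.items())]

print(federated_open_orders_by_customer())
# [('Acme Corp', 'East', 625.0), ('Globex', 'West', 80.0)]
```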
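For the event-driven engine option, the core loop filters each incoming event, updates a calculation held in memory, and raises an alert the moment a rule's threshold is crossed. The sketch below keeps a running order total per customer and fires once when it passes a limit; the event shape, the threshold, and the alert handler are assumptions for illustration, not a description of any particular product.

```python
from collections import defaultdict

# Illustrative in-memory rule: alert when a customer's order total
# exceeds a threshold within the current window.
THRESHOLD = 1000.0
running_totals = defaultdict(float)
alerted = set()

def send_alert(customer_id, total):
    """Stand-in for notifying a dashboard, queue, or on-call analyst."""
    print(f"ALERT: customer {customer_id} has {total:.2f} in orders, over {THRESHOLD:.2f}")

def handle_event(event):
    """Filter, calculate, and check rules for one event as it arrives."""
    if event.get("type") != "order_placed":      # filter: ignore irrelevant events
        return
    cid = event["customer_id"]
    running_totals[cid] += event["amount"]       # calculation kept in memory
    if running_totals[cid] > THRESHOLD and cid not in alerted:
        alerted.add(cid)                         # alert once per threshold crossing
        send_alert(cid, running_totals[cid])

# A small stream of events standing in for the ESB feed.
stream = [
    {"type": "order_placed", "customer_id": 7, "amount": 600.0},
    {"type": "page_view", "customer_id": 7},
    {"type": "order_placed", "customer_id": 7, "amount": 450.0},   # pushes 7 over the limit
    {"type": "order_placed", "customer_id": 9, "amount": 200.0},
]
for event in stream:
    handle_event(event)
```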
The Last Word
In summary, you can architect operational BI systems in
multiple ways. The key decision is whether to support
operational BI inside or outside the DW. Operational BI
within a DW maintains a single version of the truth and
ensures high-quality data, but not all organizations can afford to re-architect a DW for low-latency data and must look to alternatives.
About the Author
Wayne Eckerson is an internationally recognized thought leader in the business intelligence and analytics field. He is a sought-after consultant and noted speaker who thinks critically, writes clearly, and presents persuasively about complex topics.
Eckerson has shared his insights and advice with a wide range of companies. Recent clients include AAA, Air Canada, Carlsberg, Boston Beer, New Balance, Avalon Bay Communities, and TE Connectivity. He has also conducted many groundbreaking research studies, chaired numerous conferences, and written two widely read books: The Secrets of Analytical Leaders: Insights from Information Insiders and Performance Dashboards: Measuring, Monitoring, and Managing Your Business.
Eckerson is founder and principal consultant of Eckerson Group, LLC, a business-technology consulting firm that helps business leaders use data and technology to drive better insights and actions. His team of senior researchers and consultants provides cutting-edge information and advice on business intelligence, analytics, performance management, data governance, data warehousing, and big data. They work closely with organizations that want to assess their current capabilities and develop a strategy for turning data into insights and action. He can be reached at [email protected].