August 18, 2014
Companies are using disparate data types more frequently to gain
value via analytics. TDWI research indicates that companies are
becoming more interested in enriching the traditional structured
data found in their data warehouses or marts with other kinds of
data. This might include demographic, geospatial, or even text data.
These companies realize that utilizing disparate data can improve
analysis by providing more attributes for discovery or improving
model performance. For example, a marketing department at a retail
chain trying to understand customer behavior to develop a promotion
plan might combine standard transaction data from its data
warehouse (such as purchase type and amount spent) with
non-traditional data, such as distance from store or census data
bought from a third party, to build a data set for analysis.
At the same time, analytic tools are becoming easier to use and
the business analyst is becoming a primary user of analytics. Data
access and integration are often stumbling blocks for business
analysts, who may have a hard time accessing disparate data and
getting it ready for analysis. Doing this manually can take time or
require specific skills for data integration. Companies are looking
at ways to bring disparate and often dispersed data together in
an analytic data set to be explored and modeled without upfront
integration. In other words, they do not want to combine it into a
warehouse or data cube before starting to analyze it. This is often
referred to as data blending: combining data from multiple
sources without integrating it into a data warehouse or other system
of record. This kind of analysis is useful for discovery and analytics
that do not necessarily lend themselves to traditional reporting from an
enterprise data store.
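As a rough illustration of the retail example, the sketch below blends a warehouse extract with third-party census data in memory, without loading either into a shared store first. Everything in it is an assumption made for illustration: the field names, the sample rows, the store coordinates, and the choice of ZIP code as the join key are hypothetical, and real blending tools handle this step in their own ways.

```python
import math

# Hypothetical rows, as they might arrive from a data warehouse extract.
transactions = [
    {"customer_id": 1, "zip": "02134", "purchase_type": "apparel", "amount": 120.0},
    {"customer_id": 2, "zip": "02139", "purchase_type": "grocery", "amount": 45.5},
    {"customer_id": 3, "zip": "99999", "purchase_type": "apparel", "amount": 80.0},
]

# Hypothetical third-party census data, keyed by ZIP code.
census = {
    "02134": {"median_income": 62000, "lat": 42.355, "lon": -71.130},
    "02139": {"median_income": 71000, "lat": 42.365, "lon": -71.104},
}

# Hypothetical store location (latitude, longitude).
STORE = (42.360, -71.060)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def blend(transactions, census, store):
    """Left-join transactions to census rows on ZIP code.

    Unmatched transactions are kept with None in the added fields,
    so gaps in the third-party data stay visible to the analyst.
    """
    blended = []
    for row in transactions:
        out = dict(row)  # copy, so the source rows are not modified
        extra = census.get(row["zip"])
        if extra is None:
            out["median_income"] = None
            out["distance_km"] = None
        else:
            out["median_income"] = extra["median_income"]
            out["distance_km"] = round(
                haversine_km(extra["lat"], extra["lon"], store[0], store[1]), 1
            )
        blended.append(out)
    return blended


analytic_set = blend(transactions, census, STORE)
for record in analytic_set:
    print(record)
```

The join is deliberately a left join: a transaction with no matching census row is retained rather than dropped, which makes missing or mismatched keys easy to spot before any modeling begins.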
Data blending isn’t simply a matter of throwing data together; issues
like data quality still need to be addressed. This Checklist Report
focuses on helping organizations understand the steps and features
that are part of data blending.