Executive Summary: Managing Big Data
- By Philip Russom, Ph.D.
- October 1, 2013
The emerging phenomenon called big data is forcing numerous changes in businesses and other
organizations. Many struggle just to manage the massive data sets and non-traditional data
structures that are typical of big data. Others are managing big data by extending their data
management skills and their portfolios of data management software. This empowers them to
automate more business processes, operate closer to real time, and, through analytics, learn valuable
new facts about business operations, customers, partners, and so on.
The result is big data management (BDM), an amalgam of old and new best practices, skills, teams,
data types, and home-grown or vendor-built functionality. All of these are expanding and realigning
so that businesses can fully leverage big data, not merely manage it. At the same time, big data must
eventually find a permanent place in enterprise data management.
BDM is well worth doing because managing big data leads to a number of benefits. According to
this report’s survey, the business and technology tasks that improve most are analytic insights, the
completeness of analytic data sets, business value drawn from big data, and all sales and marketing
activities. BDM also has challenges; common barriers include low organizational maturity
relative to big data, weak business support, and the need to learn new technology approaches.
Despite the newness of big data, half of organizations surveyed are actively managing big data today.
For a quarter of organizations, big data mostly takes the form of the relational and structured data
that comes from traditional applications, whereas another quarter manages traditional data along
with big data from new sources such as Web servers, machines, sensors, customer interactions, and
social media.
A quarter of surveyed organizations have managed to scale up preexisting applications and databases
to handle burgeoning volumes of relational big data. Another quarter has gone out on the leading
edge by acquiring new data management platforms that are purpose-built for managing and
analyzing multi-structured big data. Many more are evaluating such big data platforms now, creating
a brisk market of vendor products and services for managing big data.
According to the survey, the Hadoop Distributed File System (HDFS), MapReduce, and various
Hadoop tools will be the software products most aggressively adopted for BDM in the next three
years. Others include complex event processing (for streaming big data), NoSQL databases (for
schema-free big data), in-memory databases (for real-time analytic processing of big data), private
clouds, in-database analytics, and grid computing.
Organizations are adjusting their technical best practices to accommodate BDM. Most are schooled
in extract, transform, and load (ETL) in support of data warehousing (DW) and reporting.
Preparing big data for analytics is similar but not identical. Organizations are retraining existing
personnel, augmenting their teams with consultants, and hiring new personnel. The focus is on data
analysts, data scientists, and data architects who can develop the applications for data exploration
and discovery analytics that organizations need for getting value from big data.
This report accelerates users’ understanding of the many options that are available for big data
management (BDM), including old, new, and upcoming options. The report brings readers up to
date so they can make intelligent decisions about which tools, techniques, and team structures to
apply to their next-generation solutions for BDM.
Cloudera, Dell Software, Oracle, Pentaho, SAP, and SAS sponsored the research for this report.
About the Author
Philip Russom, Ph.D., is senior director of TDWI Research for data management and is a well-known figure in data warehousing, integration, and quality, having published over 600 research reports, magazine articles, opinion columns, and speeches over a 20-year period. Before joining TDWI in 2005, Russom was an industry analyst covering data management at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and consultant, was a contributing editor with leading IT magazines, and worked as a product manager at database vendors. His Ph.D. is from Yale. You can reach him on Twitter (twitter.com/prussom) and on LinkedIn (linkedin.com/in/philiprussom).