January 21, 2015
Data science is a hot topic among business and IT leaders.
Excitement about the potential benefits of data science is tempered,
however, by anxiety about how hard it is to find, hire, and train data
science personnel, not to mention the difficulty of defining the term
within the context of an organization’s goals and objectives.
There is no single definition of data science, nor one solution or
technology. It is a term that brings together contributions from several
fields, including statistics, mathematics, operations research,
computer science, data mining, machine learning (algorithms that can
learn from data), software programming, and data visualization. It
can cover the entire process of acquiring and cleaning data, methods
for exploring the data and extracting value from it, and techniques for
making insights actionable for humans and automated processes.¹
Most often, the focus of data science is to optimize decisions and
realize higher value from data through advanced analysis.
One factor that makes data science distinct, however, is the word
science. Data science is about applying scientific methods to explore
and test hypotheses about the data. Indeed, many data scientists
come from hard science fields such as chemistry, neurobiology, and
physics, including nuclear physics. Data science
pioneers have contributed mightily to the growth of social media and
e-commerce; now, firms in other industries are keen to apply data
science to their decision-making processes.
Continuous experimentation through examination of data to test
hypotheses is at the heart of most data science projects. At the same
time, the availability of technologies that can work with enormous
data volumes and variety enables professionals to complement
scientific methods with hypothesis-free approaches that employ
machine learning to examine data and discover unforeseen patterns
before articulating a hypothesis. This enables organizations to use
data science to find previously hidden risks and opportunities and
apply analytics to improve outcomes.
To solve business problems, develop new products and services, and
optimize processes, organizations increasingly need analytics insights
produced by data science teams that combine diverse technical skills
with business knowledge and strong communication skills. This TDWI
Checklist Report describes seven steps to achieve a successful data
science strategy.