Predictive analytics (PA) has emerged as a go-to approach for making data-driven business decisions. Neither the science of PA nor the algorithms commonly used in it are new. What is new is how organizations are leveraging predictive techniques and insights to drive business value.
This tutorial will describe tasks performed by data engineers and predictive modelers who want to prepare data for predictive modeling. We will cover three of the six stages of predictive analytics as defined in CRISP-DM, the Cross-Industry Standard Process for Data Mining: Business Understanding, Data Understanding, and Data Preparation. Throughout the tutorial, concepts will be illustrated with data and real use cases.
You Will Learn
- How to audit data for predictive modeling
- How to clean data, including filling missing data and removing or mitigating the effects of outliers
- How to create new features and derived attributes
- How to select features and reduce the number of candidate inputs for predictive models
- How to sample data for predictive modeling
This tutorial is geared to business analysts, data analysts, and data scientists who need or want to learn how to get started with preparing data for predictive modeling, as well as experienced analysts who want to expand their understanding of practical data preparation tips and tricks.
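To give a flavor of the data preparation steps listed above, here is a minimal sketch in Python using only the standard library. It is not material from the tutorial itself: the toy records, the field names (`age`, `income`, `visits`), the helper functions, and the choice of median filling and Tukey-fence clipping are all assumptions made for illustration. Feature selection is left out to keep the sketch short.

```python
import random
import statistics

# Hypothetical toy records (invented for illustration); None marks a missing value.
records = [
    {"age": 34, "income": 52000.0, "visits": 4},
    {"age": 45, "income": None, "visits": 2},
    {"age": 29, "income": 48000.0, "visits": 7},
    {"age": 52, "income": 910000.0, "visits": 1},  # suspiciously large value
    {"age": 41, "income": 61000.0, "visits": 3},
    {"age": 38, "income": 57000.0, "visits": 5},
]

def audit(rows, field):
    """Audit one field: count missing values and summarize the rest."""
    present = [r[field] for r in rows if r[field] is not None]
    return {
        "missing": len(rows) - len(present),
        "min": min(present),
        "max": max(present),
        "median": statistics.median(present),
    }

def fill_missing(rows, field):
    """Return copies of the rows with missing values replaced by the median."""
    med = statistics.median(r[field] for r in rows if r[field] is not None)
    return [{**r, field: med if r[field] is None else r[field]} for r in rows]

def clip_outliers(rows, field, k=1.5):
    """Mitigate outliers by clipping to the Tukey fences (quartiles +/- k * IQR)."""
    q1, _, q3 = statistics.quantiles([r[field] for r in rows], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [{**r, field: min(max(r[field], low), high)} for r in rows]

def add_income_per_visit(rows):
    """Create a derived attribute; assumes 'visits' is never zero in this toy data."""
    return [{**r, "income_per_visit": r["income"] / r["visits"]} for r in rows]

def sample_rows(rows, n, seed=0):
    """Draw a reproducible simple random sample without replacement."""
    return random.Random(seed).sample(rows, n)

income_audit = audit(records, "income")
cleaned = clip_outliers(fill_missing(records, "income"), "income")
featured = add_income_per_visit(cleaned)
train = sample_rows(featured, 4)

print(income_audit)
print(train)
```

On real data the same steps would usually be applied per column across a full table, and the filling and clipping rules would be chosen after the audit rather than fixed in advance.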