TDWI Chicago Update

At TDWI, we have been working hard to navigate this ever-changing landscape in the face of COVID-19, and we want to assure you that the health and well-being of our employees, customers, and vendor partners is our top priority. Therefore, due to the growing concern around the coronavirus (COVID-19), and in alignment with the guidelines laid out by the CDC and WHO, we have decided to merge this year’s TDWI Chicago Conference (May 10-15) with TDWI Orlando 2020 (November 8-13), where it can be a successful experience for everyone. The Chicago 2020 agenda will be replicated at TDWI Orlando 2020.

Our registration team will be in contact with individual registrants and sponsors directly.

Course Description

DS2 Data Science Bootcamp // Data Sourcing and Preparation for Data Science

May 11, 2020

1:45 pm - 5:00 pm

Duration: Half Day Course

Level: Beginner to Intermediate

Prerequisite: None

Dean Abbott

Co-Founder/Chief Data Scientist

Smarter HQ, Inc.

You may have heard that data scientists spend 80 percent of their time sourcing, cleaning, and preparing data. Whether or not that figure is an exaggeration, data preparation is certainly a large and important part of data science and predictive analytics. Data rarely starts out in the ideal format: it may contain bad values, it may not be easily accessible, or it may need to be transformed before we can start exploring it and building models. In this session, we will provide an overview of sourcing and preparing data for data science and predictive analytics projects. We will work through a motivating example from the speaker’s work and touch on how Python, SQL, and Hadoop can be used in the data preparation workflow.
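To give a flavor of what "bad values" and "transformation" can mean in practice, here is a minimal sketch using only the Python standard library and an in-memory SQLite database. It is not taken from the course material: the sample data, the column names, and the helpers `parse_age`, `clean_rows`, and `load` are all hypothetical, and Hadoop is left out entirely.

```python
import csv
import io
import sqlite3

# Hypothetical raw export (made up for illustration): note the stray
# whitespace, a missing age, a non-numeric age, and inconsistent city casing.
RAW_CSV = """customer_id,age,city
1, 34 ,Chicago
2,,Orlando
3,abc,chicago
4,51,Orlando
"""


def parse_age(text):
    """Return the age as an int, or None when it is missing or invalid."""
    try:
        age = int(text.strip())
    except ValueError:
        return None
    # Treat implausible ages as bad values too (the range is an assumption).
    return age if 0 < age < 120 else None


def clean_rows(raw_csv):
    """Yield (customer_id, age, city) tuples with normalized values."""
    for row in csv.DictReader(io.StringIO(raw_csv)):
        yield (
            int(row["customer_id"]),
            parse_age(row["age"]),
            row["city"].strip().title(),
        )


def load(rows):
    """Load cleaned rows into an in-memory SQLite table for SQL exploration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER, age INTEGER, city TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    return conn


conn = load(clean_rows(RAW_CSV))
# AVG ignores NULL ages, so bad values do not distort the per-city average.
summary = conn.execute(
    "SELECT city, COUNT(*), AVG(age) FROM customers GROUP BY city ORDER BY city"
).fetchall()
print(summary)
```

In this sketch, invalid ages are stored as NULL rather than dropped, so the rows still count toward each city while SQL aggregates such as `AVG` skip the missing values. Whether to drop, flag, or impute bad values is a project-specific decision.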

Geared To

  • Anyone who is getting started in data science and is interested in learning more about data preparation. This includes BI and analytics professionals and managers who are exploring the broader world of data science. Nontechnical professionals are welcome as well. Intermediate to advanced professional data scientists will find this session to be a review.