Level: Beginner to Intermediate
If you want to build the most useful machine learning models, you need to use the best data. This idea is summarized in a famous quote: “Data trumps algorithm.” In this hands-on course, you will learn some of the most useful data wrangling techniques for producing the best data to use in your machine learning models.
Through a series of labs, you will get hands-on experience wrangling data in R with packages such as dplyr and lubridate. This course is designed for a broad audience, and no prior knowledge of R programming is required.
The goal of this course is for you to return to work and start employing these techniques to wrangle your own data, enhance your data analyses, and craft the most useful machine learning models.
Want to know the best part?
Although the course uses R because it is free, the concepts and techniques you will learn apply to any machine learning technology you might use.
You Will Learn
- How to wrangle data in R the tidyverse way
- Working with character data
- Wrangling date and time data
- Pivoting/aggregating tables of data
- Joining tables of data
- Strategies for dealing with missing data
- Additional resources to extend your learning
Who Should Attend
- Business/data analysts
- Database developers
- BI/report developers
- Anyone interested in building machine learning models
No skills in programming or statistics are required!
Laptop Requirements
- Windows or Mac OS X
- 64-bit operating system
- 8 GB of available RAM
- 1 GB of available hard drive space to download and install R, RStudio, and the required packages
Instructions for preparing your laptop will be emailed to registrants before the event. Please complete the setup BEFORE the conference, because no class time is allotted for laptop preparation.
* Enrollment is limited to 40 attendees.