Introduction to Data Engineering (NEW)
Duration: One-Day Course
Prerequisite: None
The first challenge in the machine learning life cycle is understanding the problem or opportunity; the next challenge is acquiring, understanding, and preparing data for the modeling phase. This step is estimated to take more than 50% of the time allotted for a machine learning project. In this course, we address how to translate the problem statement, identify data sources, explore the data for relationships and patterns, identify the starting inputs for the model, prepare data, and validate it for the model-fitting process.
You Will Learn How To
- Understand the data science project methodology
- Understand data source identification (i.e., aligning data with the problem model)
- Evaluate data findings to determine and validate modeling techniques
- Review feature selection techniques
- Understand data preparation techniques (cleansing, formatting, and blending approaches)
- Plan for data pipelines (proactive and reusable data preparation)
- Understand data visualization techniques for data understanding and data preparation
Geared To
- Data engineers, data scientists, business and data analysts, project managers
- Roles that need insight into best practices and techniques related to data understanding and preparation