Ensembling is one of the hottest techniques in today's predictive analytics competitions. Recent winners of Kaggle and KDD Cup competitions have consistently relied on ensemble methods, including well-known algorithms such as XGBoost and Random Forest, and techniques such as "deep stacking".
Are these competition victories paving the way for widespread organizational adoption of these techniques? Partly. We will walk through an effective, practical approach to ensembling that is most applicable to organizational problems, attainable by analytics practitioners, and adoptable by leadership.
This course provides a detailed overview of ensemble models and their origins, and shows why they are so effective. We will explain the building blocks of virtually all ensemble techniques: bagging, boosting, and stacking.
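To make the three building blocks concrete, here is a minimal sketch using scikit-learn (an assumption; the course does not specify a toolkit, and the dataset below is synthetic stand-in data):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier,
                              GradientBoostingClassifier,
                              RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data standing in for a real organizational dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

models = {
    # Bagging: fit many models on bootstrap samples, then average their votes.
    "bagging": BaggingClassifier(n_estimators=50, random_state=0),
    # Boosting: fit models sequentially, each one correcting its predecessors.
    "boosting": GradientBoostingClassifier(n_estimators=100, random_state=0),
    # Stacking: a meta-model learns how to combine the base models' predictions.
    "stacking": StackingClassifier(
        estimators=[("rf", RandomForestClassifier(random_state=0)),
                    ("gb", GradientBoostingClassifier(random_state=0))],
        final_estimator=LogisticRegression(),
    ),
}

for name, model in models.items():
    score = cross_val_score(model, X, y, cv=3).mean()
    print(f"{name}: {score:.3f}")
```

Each of the three typically beats a single decision tree on the same data, which is the core argument for ensembling.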
While not a prerequisite, the "Decision Trees in Machine Learning" course provides a solid foundation in machine learning and supervised learning techniques for this session.
What You Will Learn
- What are ensemble models and what are their advantages?
- Why are ensembles in the news?
- Three influential ensembling approaches and three famous algorithms
- The core elements of ensembles and their application: bagging, boosting, and stacking
- How to apply “meta-modeling” to real-world problems
- The pros and cons of complex "black box" techniques in solving business problems
- The challenge of applying competition strategies to organizational problems
- Case Study: Using an ensemble to address systematically missing data
Who Should Attend

Analytics practitioners; data scientists; IT professionals; technology planners; consultants; business analysts; analytics project leaders