Hands-On Machine Learning Bootcamp

A TDWI Certificate Track

Virtual Classroom
June 24–26, 2024
9:00am – 5:00pm CT

Hands-On: Introduction to Cluster Analysis for Data Science with Python

June 26, 2024

9:00 am - 5:00 pm

Central Time (CT)

Prerequisite: None

David Langer

Founder

Dave on Data

Cluster analysis is a type of machine learning that splits data into groups (i.e., clusters) that are meaningful, useful, or both. The groups are meaningful when they surface the underlying structure of the data. They are useful when they serve as inputs to another process (e.g., classification).

Cluster analysis is among the most broadly applicable machine learning techniques, with uses across industries and business domains. It is used in scenarios such as:

  • Customer segmentation
  • Anomaly detection
  • Document classification
  • Supply chain optimization

This class is an introduction to cluster analysis using Python. Attendees will learn how to employ two of the most popular clustering techniques (k-means and DBSCAN) to craft new insights from data via hands-on labs.
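As a rough illustration of the first of these techniques, the sketch below implements the basic k-means loop in plain Python. It is not course material: the function names, the toy points, and the farthest-first choice of starting centroids are assumptions made to keep the example small and reproducible. Library implementations typically use randomized starting points instead.

```python
import math


def farthest_first(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from the centroids chosen so far (an assumption for this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, n_iter=100):
    """Basic k-means loop: returns (centroids, labels) for a list of numeric tuples."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, labels


# Hypothetical toy data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

On this toy data the first three points end up in one cluster and the last three in the other. Real data is rarely this clean, which is why choosing the number of clusters is a topic in its own right.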

Although this course contains some mathematics, the level of math is accessible to a broad audience and the focus is on concepts, not calculations.

You Will Learn

  • How cluster analysis differs from other forms of machine learning
  • Use cases for cluster analysis
  • The k-means clustering algorithm
  • Optimizing the number of k-means clusters
  • The DBSCAN clustering algorithm
  • Optimizing the clusters found by DBSCAN
  • Handling categorical data with one-hot encoding
  • Handling categorical data with factor analysis of mixed data (FAMD)
  • Additional resources for honing skills
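To give a flavor of one topic in this list, the sketch below shows the idea behind one-hot encoding in plain Python: a categorical column is replaced by 0/1 indicator columns so that distance-based methods can work with it. It is an illustration only, not course material, and the `one_hot` helper and the customer records are hypothetical.

```python
def one_hot(rows, column):
    """Replace one categorical column with 0/1 indicator columns."""
    # Collect the distinct category values in a stable (sorted) order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per category.
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded


# Hypothetical customer records mixing a numeric and a categorical field.
customers = [
    {"spend": 120.0, "region": "west"},
    {"spend": 80.0, "region": "east"},
    {"spend": 95.0, "region": "west"},
]
encoded = one_hot(customers, "region")
print(encoded[0])
```

Each record keeps its numeric field and gains one indicator column per region, so every record becomes a purely numeric row.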

Geared To

  • Business and data analysts
  • BI and analytics developers and managers
  • Business users
  • Anyone interested in using cluster analysis with their business data

Prerequisites

Registrants must be familiar with Python and Python Notebooks or complete a complimentary online course before the conference. Access to the four-hour online course "Python Quick Start" will be provided to registrants three weeks prior to the event.

No background in advanced mathematics or statistics is required.

Laptop Setup

Students must bring their laptops to class.

Machine Requirements:

  • Windows or Mac OS X
  • 64-bit operating system
  • 8 GB available RAM, 16 GB preferred
  • 4 GB of HD space for Anaconda Python installation

Anaconda Python is used in this course because it is free, easy to install, and has all the needed libraries.

Setup:

Laptop setup is required BEFORE the conference. Instructions will be emailed to registrants before the event.

There is no time allotted in class for laptop preparation.

* Enrollment is limited to 35 attendees.
