TDWI Nashville 2023 has ended. Sorry you missed it!



Course Description

TH3 Hands-On: Introduction to Text Analytics for Data Science with Python (NEW!)

May 18, 2023

9:00 am - 5:00 pm

Duration: Full Day Class

Level: Intermediate to Advanced

Prerequisite: See below

David Langer

Expert in Machine Learning, Python, and R


Dave on Data

This course is a hands-on introduction to text analytics using Python. Attendees will learn the fundamentals of building pipelines that clean and transform text documents into formats that can be fed to clustering and classification machine learning algorithms.

Although this course contains some mathematics, the level of math is accessible to a broad audience and the focus is on concepts, not calculations.
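
To make the idea of a cleaning-and-transformation pipeline concrete, here is a minimal sketch in plain Python. It is not taken from the course materials: the function names, the tiny stopword list, and the two sample sentences are assumptions chosen for illustration, and real pipelines typically rely on dedicated libraries and far larger stopword lists.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def ngrams(tokens, n):
    """Return the contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(text, n=1):
    """Clean one document and count its n-grams (order is discarded)."""
    tokens = remove_stopwords(tokenize(text))
    return Counter(ngrams(tokens, n))


# Hypothetical sample documents.
docs = ["The cat sat on the mat.", "The dog sat on the log."]

# One bag of words per document.
bags = [bag_of_words(d) for d in docs]

# A shared, sorted vocabulary gives every document the same columns.
vocabulary = sorted(set().union(*bags))

# Document-term matrix: one row per document, one column per term.
matrix = [[bag[term] for term in vocabulary] for bag in bags]
```

Each row of `matrix` is a fixed-length numeric vector, which is the kind of format that clustering and classification algorithms generally expect as input.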

You Will Learn

  • Representing text documents using the bag of words model
  • Tokenization
  • Stopword removal
  • Stemming and lemmatization
  • Part-of-speech (POS) tagging
  • N-grams
  • Term frequency-inverse document frequency (TF-IDF)
  • Grouping text documents based on similarity
  • Classifying text documents
  • Additional resources for honing skills
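
As a rough illustration of two of the topics listed above, TF-IDF weighting and grouping documents by similarity, the sketch below uses only the Python standard library. It is an assumption-laden example, not the course's own code: the function names and the three pre-tokenized sample documents are hypothetical, and it uses the plain, unsmoothed `log(N / df)` form of inverse document frequency, which is only one of several common variants.

```python
import math
from collections import Counter


def tf_idf(tokenized_docs):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    This unsmoothed variant is an assumption; libraries often smooth it.
    """
    n_docs = len(tokenized_docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    vectors = []
    for doc in tokenized_docs:
        if not doc:  # An empty document gets an empty vector.
            vectors.append({})
            continue
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents, already tokenized and cleaned.
docs = [
    ["cat", "sat", "mat"],
    ["cat", "sat", "hat"],
    ["stock", "market", "fell"],
]

vectors = tf_idf(docs)
sim_01 = cosine_similarity(vectors[0], vectors[1])  # Two documents sharing terms.
sim_02 = cosine_similarity(vectors[0], vectors[2])  # No shared terms.
```

Here `sim_01` comes out larger than `sim_02`, so a similarity-based grouping step would place the first two documents together and the third apart. A term that appears in every document receives a weight of zero under this variant, which is the intended effect of the inverse document frequency factor.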

Geared To

  • Business and data analysts
  • BI and analytics developers and managers
  • Business users
  • Anyone interested in using text analytics with their business data


Prerequisite

Registrants must be familiar with Python and Python notebooks, or complete a complimentary online course before the conference. Access to the four-hour online course, "Python Quick Start," will be provided to registrants three weeks before the event.

No background in advanced mathematics or statistics is required.

Laptop Setup

Students must bring their laptops to class.

Machine Requirements:

  • Windows or Mac OS X
  • 64-bit operating system
  • 8 GB available RAM, 16 GB preferred
  • 4 GB of HD space for Anaconda Python installation

Anaconda Python is used in this course because it is free, easy to install, and has all the needed libraries.


Laptop setup is required BEFORE the conference. Instructions will be emailed to registrants before the event.

There is no time allotted in class for laptop preparation.

* Enrollment is limited to 40 attendees.
