


Hands-On: Text Analytics for Data Science with Python

Duration: One Day Course

Prerequisite: See below

Course Outline

This course can also be delivered using R.

Despite predictions to the contrary, textual data has grown exponentially in recent years. Whether it comes from documents, blogs, social media posts, or customer service chats, textual data has increased not only in volume but also in potential value.

However, there’s a problem.

From the perspective of analytics and data science, textual data is unstructured. Text analytics techniques take the unstructured raw text data and transform it into representations useful for analysis and machine learning.

This course is a hands-on introduction to text analytics using Python. Your team will learn the fundamentals of building pipelines that clean and transform text documents into formats that can be fed to clustering and machine learning algorithms.
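As a rough illustration of what such a pipeline can look like, here is a minimal sketch that uses only the Python standard library. It is not taken from the course materials: the function names, the tiny stopword list, and the sample documents are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "is", "was", "and", "to", "of", "my"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def clean(text):
    """Tokenize the text and drop stopwords."""
    return [tok for tok in tokenize(text) if tok not in STOPWORDS]


def bag_of_words(docs):
    """Return (vocabulary, matrix); each matrix row counts vocabulary terms in one document."""
    token_lists = [clean(doc) for doc in docs]
    vocab = sorted({tok for toks in token_lists for tok in toks})
    matrix = []
    for toks in token_lists:
        counts = Counter(toks)  # missing terms count as 0
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix


# Hypothetical sample documents.
docs = [
    "The delivery was late and the package was damaged.",
    "My package arrived early!",
]
vocab, matrix = bag_of_words(docs)
print(vocab)
print(matrix)
```

Each row of `matrix` is a fixed-length numeric vector, which is the kind of format that clustering and machine learning algorithms can accept as input.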

Text analytics is the foundation of some of the most recent advancements in artificial intelligence (AI), such as the large language models behind ChatGPT.

Although this course contains some mathematics, the level of math is accessible to a broad audience and the focus is on concepts, not calculations.

Your Team Will Learn

  • How to represent text documents using the bag of words model
  • Tokenization
  • Stopword removal
  • Stemming and lemmatization
  • Part-of-speech (POS) tagging
  • N-grams
  • Term frequency-inverse document frequency (TF-IDF)
  • How to group text documents based on similarity
  • How to classify text documents
  • Additional resources for honing skills
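To make a few of the topics above concrete, the following sketch shows n-grams, TF-IDF weighting, and similarity-based comparison of documents in plain Python. It is an illustration under stated assumptions, not the course's own code: the helper names and the pre-tokenized sample documents are hypothetical, and the TF-IDF formula is the basic unsmoothed variant (term frequency times log(N / document frequency)). Libraries often use smoothed or normalized variants, so their numbers may differ.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-token sequences in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(token_lists):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    n_docs = len(token_lists)
    doc_freq = Counter()
    for toks in token_lists:
        doc_freq.update(set(toks))  # count each term once per document
    weights = []
    for toks in token_lists:
        counts = Counter(toks)
        total = len(toks)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents, already tokenized and cleaned.
docs = [
    ["refund", "request", "late", "delivery"],
    ["late", "delivery", "complaint"],
    ["password", "reset", "request"],
]
weights = tf_idf(docs)
print(ngrams(docs[1], 2))
print(cosine(weights[0], weights[1]), cosine(weights[1], weights[2]))
```

Documents that share distinctive terms score higher on cosine similarity, and that score is one common basis for grouping similar documents together.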

Geared To

  • Business and data analysts
  • BI and analytics developers and managers
  • Business users
  • Data scientists
  • Anyone interested in using text analytics with their business data

No background in advanced mathematics or statistics is required.


Prerequisite

Students must be familiar with Python and Jupyter notebooks or complete the prerecorded course “Python Quick Start” prior to the class. This prerecorded course will be made available in advance to any students who need it.

Laptop Setup

Attendees will need a laptop computer with specific software installed before the session. In advance of the class, attendees will receive detailed software download and installation instructions.

Download the TDWI Course Catalog to Get Started Today