Level: Intermediate to Advanced
Prerequisite: Basic Python familiarity (see Prerequisites below)
Vector databases are the backbone of modern AI systems, enabling the semantic search and retrieval that power RAG architectures, recommendation engines, and intelligent applications. Yet many data and AI practitioners lack hands-on experience building and querying them. This full-day workshop closes that gap with practical, code-first exercises using Python, Chroma, and real-world data sets to reinforce every concept.
Participants will move from embedding fundamentals through similarity search, indexing strategies, and LLM integration, finishing with a working RAG pipeline. The course also addresses governance, monitoring, and enterprise considerations so attendees leave ready to apply vector database capabilities in production AI environments.
You Will Learn To:
- Explain what vector embeddings are and how they represent meaning in AI systems
- Generate embeddings using open-source sentence-transformers and store them in a vector database
- Apply cosine similarity and approximate nearest neighbor (ANN) indexing concepts using FAISS for fast semantic retrieval
- Build a retrieval-augmented generation (RAG) pipeline connecting a vector database with an LLM
- Visualize vector clusters and embedding distance patterns using PCA and t-SNE
- Compare vector database architectures (Chroma, Pinecone, pgvector) and select the right fit for enterprise use cases
- Apply governance, logging, and monitoring best practices for AI observability and enterprise data controls
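To give a flavor of the similarity-search objective above, here is a minimal sketch of cosine-similarity retrieval in Python. It is not taken from the workshop materials: the helper names `cosine_similarity` and `top_k` are hypothetical, the three-dimensional vectors are made-up stand-ins for real embeddings, and the search is an exact brute-force scan using NumPy rather than an ANN index such as FAISS.

```python
import numpy as np


def cosine_similarity(query, matrix):
    """Cosine similarity between one query vector and each row of `matrix`."""
    query = np.asarray(query, dtype=float)
    matrix = np.asarray(matrix, dtype=float)
    # Dot product of each row with the query, divided by the product of norms.
    return (matrix @ query) / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(query))


def top_k(query, matrix, k=2):
    """Return (row_index, score) pairs for the k most similar rows."""
    scores = cosine_similarity(query, matrix)
    order = np.argsort(-scores)[:k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]


# Toy "embeddings": hypothetical values, not produced by a real model.
documents = ["cats and kittens", "dogs and puppies", "quarterly tax filing"]
embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.3, 0.1],
    [0.0, 0.1, 0.9],
])

query_vec = np.array([1.0, 0.2, 0.0])  # stand-in for an embedded query
results = top_k(query_vec, embeddings, k=2)

for index, score in results:
    print(f"{documents[index]}: {score:.3f}")
```

In practice the vectors would come from an embedding model, and a vector database or ANN index would replace the brute-force scan once the collection grows large.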
Geared To:
- Data engineers
- AI/ML engineers
- Data scientists
- Analytics practitioners
- Software developers building AI-powered applications
- Database administrators exploring AI integration
Prerequisites:
Registrants must be familiar with basic Python programming concepts or complete a complimentary online course before the conference. Access to the 3-hour online course "Python Quick Start" will be provided to registrants three weeks prior to the event.
Familiarity with data concepts and SQL is helpful but not required.
Laptop Setup:
Students must bring their own laptop to class with Python 3.9+ installed and with access to Google Colab (recommended) or Jupyter Notebook.
Note on Corporate Laptops
This course requires the ability to install software, access web service APIs, and download data files, library files, and code.
If your corporate laptop blocks these activities, we recommend contacting your corporate IT department early or bringing a personal device instead.
Setup
Laptop setup is required BEFORE the conference. Instructions will be emailed to registrants before the event.
There is no time allotted in class for laptop preparation.
* Enrollment is limited to 32 attendees.