Hands-On: Customizing LLMs for Business Outcomes with AWS and SageMaker

Course Description

TH4 Hands-On: Customizing LLMs for Business Outcomes with AWS and SageMaker (New!)

February 22, 2024

9:00 am - 5:00 pm

Duration: Full Day

Level: Intermediate to Advanced

Prerequisite: See below

Krish Krishnan

Founder

Sixth Sense Advisors Inc.

Large language models (LLMs) are trained on vast quantities of text, which enables them to perform a broad range of natural language processing (NLP) tasks, such as summarization, question answering, and paraphrasing, with high accuracy. These capabilities have drawn tremendous media attention. How can a business with vast proprietary data leverage these models for precise, domain-specific answers? In this course, students will explore a variety of solutions using Amazon SageMaker.

Instructor Krish Krishnan will guide students through four approaches to customizing LLMs with enterprise data. Students will set up the AWS platform with LLM Hub. Using OpenSearch data and Amazon SageMaker, students will gain hands-on experience with each approach:

Prompt-based learning involves fine-tuning an LLM using factual knowledge that is represented as question-answer (prompt-completion) pairs. This fine-tuning process is supervised and normally involves updating the model through gradient descent. This approach does not require a large amount of data and is run for a small number of epochs.
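As a rough illustration of what prompt-completion pairs might look like before fine-tuning, the sketch below serializes a few question-answer records as JSON Lines and checks them for empty fields. The "prompt" and "completion" field names and the sample records are assumptions made for this example; the format expected by a given fine-tuning job may differ.

```python
import json

# Hypothetical question-answer pairs. The "prompt"/"completion" keys are an
# assumed schema for illustration, not a format confirmed by the course.
pairs = [
    {"prompt": "What is the return window for enterprise orders?",
     "completion": "Enterprise orders can be returned within 30 days."},
    {"prompt": "Which regions does the premium plan cover?",
     "completion": "The premium plan covers North America and Europe."},
]


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def validate(records):
    """Return the indices of records with a missing or empty field."""
    bad = []
    for i, record in enumerate(records):
        prompt = str(record.get("prompt", "")).strip()
        completion = str(record.get("completion", "")).strip()
        if not prompt or not completion:
            bad.append(i)
    return bad


jsonl_text = to_jsonl(pairs)
```

Validating the pairs before training is a cheap safeguard: with a small dataset and few epochs, a handful of empty or malformed records can have an outsized effect.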

Domain adaptation modifies the LLM to align with the enterprise domain, producing responses that are domain-centric. The original base model is further trained in a self-supervised manner with domain-specific unlabeled data to update the model through gradient descent. This approach usually requires a larger amount of data, a custom vocabulary, and a tokenizer.
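The two data-preparation ideas in this approach can be sketched in a few lines: finding frequent domain terms that a base vocabulary lacks, and turning unlabeled text into self-supervised (input, target) pairs where the target is the input shifted by one token. Everything here is a simplified assumption: the whitespace tokenizer stands in for a real subword tokenizer, and the corpus, base vocabulary, and function names are hypothetical.

```python
from collections import Counter


def whitespace_tokenize(text):
    """Stand-in tokenizer; real LLMs typically use subword tokenizers."""
    return text.lower().split()


def candidate_vocab(corpus, base_vocab, min_count=2):
    """Return frequent corpus tokens that are missing from the base vocabulary."""
    counts = Counter(tok for doc in corpus for tok in whitespace_tokenize(doc))
    return sorted(tok for tok, n in counts.items()
                  if n >= min_count and tok not in base_vocab)


def next_token_examples(tokens, block_size):
    """Build (input, target) windows; the target is the input shifted by one."""
    examples = []
    for start in range(0, len(tokens) - block_size, block_size):
        window = tokens[start:start + block_size + 1]
        examples.append((window[:-1], window[1:]))
    return examples


# Hypothetical unlabeled domain text and base vocabulary.
corpus = [
    "the reinsurer ceded the treaty premium",
    "the treaty covers ceded losses",
]
base_vocab = {"the", "covers", "premium", "losses"}

new_terms = candidate_vocab(corpus, base_vocab)
examples = next_token_examples(whitespace_tokenize(corpus[0]), block_size=3)
```

No labels are needed because the text supplies its own targets, which is why this approach can use large volumes of raw enterprise documents.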

Augmentation supplements the base LLM with external custom domain knowledge through information retrieval (IR). In this approach, a knowledge base containing domain-specific documents is used together with an IR mechanism that retrieves relevant pieces of information, such as passages or sections, referred to as "context." The retrieved context is then passed to the LLM along with the user's question.
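A minimal sketch of this retrieve-then-prompt flow follows. It assumes a simple term-overlap score in place of a production IR mechanism, and the knowledge base, prompt wording, and function names are hypothetical choices for illustration.

```python
import re


def terms(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def score(query, passage):
    """Count the distinct query terms that also appear in the passage."""
    return len(terms(query) & terms(passage))


def retrieve(query, passages, k=2):
    """Return the k passages with the highest term overlap with the query."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(query, context_passages):
    """Place the retrieved context ahead of the question in a single prompt."""
    context = "\n".join(f"- {p}" for p in context_passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical knowledge base of domain passages.
knowledge_base = [
    "Refunds are issued within 30 days of purchase.",
    "Support is available on weekdays from 9 to 5.",
    "Shipping to Canada takes five business days.",
]

query = "When are refunds issued?"
context = retrieve(query, knowledge_base, k=1)
prompt = build_prompt(query, context)
```

The base model is left unchanged in this approach; only the prompt changes, so the knowledge base can be updated without retraining.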

Vector search is an advanced approach in which textual data is transformed into semantically rich, contextualized embeddings via a text embedding model, enabling efficient and accurate information retrieval.
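The core of vector search can be sketched as ranking stored embeddings by cosine similarity to a query embedding. The three-dimensional vectors and document IDs below are hand-written assumptions for illustration; in practice the vectors would come from a text embedding model and be held in a vector index rather than a Python dictionary.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return (doc_id, similarity) for the k vectors closest to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [(doc_id, sim) for sim, doc_id in scored[:k]]


# Hypothetical embeddings; a real embedding model would produce vectors
# with hundreds or thousands of dimensions.
index = {
    "refund-policy": [0.9, 0.1, 0.0],
    "support-hours": [0.1, 0.8, 0.2],
    "shipping-times": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]

results = top_k(query_vec, index, k=2)
```

This brute-force scan compares the query against every stored vector; vector databases typically use approximate nearest-neighbor indexes to keep the search fast at scale.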

After this workshop, you will be ready to apply these skills—in your own business, with your own data, and on your own platforms—to bring the benefits of LLMs to your enterprise.

You Will Learn

Fundamentals of Amazon SageMaker
LLM refresher and AWS Hub
Amazon SageMaker endpoint configuration
OpenSearch data sets for LLMs
Advanced setup and configuration for AWS and SageMaker
LLM fine-tuning
Domain-centric LLM deployment
Context-driven LLM
Embeddings and vector databases for deploying LLMs
Exploring outcomes
Alternative choices

Geared To

ML engineers
Data scientists
Data and analytics developers
Data engineers
Architects

Prerequisites

Basic understanding of Python and SQL
AWS product familiarity

Laptop Setup

Students must bring their laptops to class.

Machine Requirements:

Windows or Mac OS X
64-bit operating system
8 GB available RAM, 16 GB preferred

Setup:

Laptop setup is required BEFORE the conference. Instructions will be emailed to registrants before the event.

There is no time allotted in class for laptop preparation.

* Enrollment is limited to 40 attendees.

TDWI Las Vegas

TDWI Transform 2024

Las Vegas | Feb. 19–23