Course Hive
Learn to post-train LLMs in this free course

DeepLearning.AI Courses - Learn to post-train LLMs in this free course

5.0 (2)
18 learners

This course includes

  • 5.5 hours of video
  • Certificate of completion
  • Access on mobile and TV

Full Transcript

Learn more: https://bit.ly/4lqtWmr

Before a large language model can follow instructions, it undergoes two key stages: pre-training and post-training. In pre-training, it learns to predict the next word or token from large amounts of unlabeled text. In post-training, it learns useful behaviors such as following instructions, tool use, and reasoning.

In our latest short course, Post-training of LLMs, you’ll learn how to use three of the most common post-training techniques, Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Online Reinforcement Learning (RL), to reshape model behavior for specific tasks or capabilities.

Taught by Banghua Zhu, Assistant Professor at the University of Washington, Principal Research Scientist at Nvidia, and co-founder of NexusFlow, this course covers:

  • When to apply post-training and how it compares to pre-training
  • How to curate and structure training data for each method
  • How to use SFT to turn a base model into an instruct model
  • How contrastive learning in DPO improves output quality
  • How to design reward functions for RL tasks like math or code
  • How to evaluate whether post-training improved or degraded model behavior

You’ll also get hands-on experience implementing each technique with Hugging Face’s TRL library to:

  • Fine-tune a base model into an instruction-following assistant
  • Modify a model’s responses using preferred and rejected examples
  • Improve a model’s reasoning with online RL and verifiable rewards

Whether you’re building safer assistants or targeting domain-specific improvements, this course will help you adapt LLMs with precision.

Enroll now: https://bit.ly/4lqtWmr
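To make the idea of a "verifiable reward" for math tasks more concrete, here is a minimal sketch in plain Python. It is not taken from the course and is not tied to TRL's API. The function names (extract_final_answer, math_reward) and the rule "treat the last number in the completion as the final answer" are illustrative assumptions. A real setup would likely need stricter answer parsing.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the last number found in the completion, or None if there is none.

    Assumption: the model states its final answer last. Commas are removed
    so that "1,000" parses as 1000 (this would also merge "3,4" into 34).
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return matches[-1] if matches else None


def math_reward(completion: str, ground_truth: str) -> float:
    """Binary reward: 1.0 if the extracted answer equals the reference, else 0.0."""
    predicted = extract_final_answer(completion)
    if predicted is None:
        return 0.0
    try:
        is_correct = abs(float(predicted) - float(ground_truth)) < 1e-6
    except ValueError:
        # A non-numeric reference cannot be checked by this sketch.
        return 0.0
    return 1.0 if is_correct else 0.0


if __name__ == "__main__":
    print(math_reward("2 + 2 = 4, so the answer is 4.", "4"))
    print(math_reward("The answer is 5", "4"))
```

The reward is deliberately binary and checked by code rather than by a learned model, which is what makes it "verifiable". How such a function is plugged into an online RL trainer depends on the library and is not shown here.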
