Course Hive

Build an LLM from Scratch 4: Implementing a GPT model from Scratch To Generate Text

Build a Large Language Model (From Scratch) - Build an LLM from Scratch 4: Implementing a GPT model from Scratch To Generate Text

5.0 (1)
32 learners

This course includes

  • 12.3 hours of video
  • Certificate of completion
  • Access on mobile and TV

Summary

This is a supplementary video explaining how to code an LLM architecture from scratch.

Links to the book:

  • Amazon: https://amzn.to/4fqvn0D
  • Manning: https://mng.bz/M96o

Link to the GitHub repository: https://github.com/rasbt/LLMs-from-scratch

Chapters:

  • 00:00 4.1 Coding an LLM architecture
  • 13:52 4.2 Normalizing activations with layer normalization
  • 36:02 4.3 Implementing a feed forward network with GELU activations
  • 52:16 4.4 Adding shortcut connections
  • 1:03:18 4.5 Connecting attention and linear layers in a transformer block
  • 1:15:13 4.6 Coding the GPT model

You can find additional bonus materials on GitHub, for example converting the GPT-2 architecture into Llama 2 and Llama 3: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/07_gpt_to_llama
