Course Hive
Transformers, the tech behind LLMs | Deep Learning Chapter 5
IA - Transformers, the tech behind LLMs | Deep Learning Chapter 5

5.0 (0)
8 learners

This course includes

  • 6.5 hours of video
  • Certificate of completion
  • Access on mobile and TV

Summary

Full Transcript

Breaking down how Large Language Models work, visualizing how data flows through them. Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

Here are a few other relevant resources:

  • Build a GPT from scratch, by Andrej Karpathy: https://youtu.be/kCc8FmEb1nY
  • If you want a conceptual understanding of language models from the ground up, @vcubingx just started a short series of videos on the topic: https://youtu.be/1il-s4mgNdI?si=XaVxj6bsdy3VkgEX
  • If you're interested in the herculean task of interpreting what these large networks might actually be doing, the Transformer Circuits posts by Anthropic are great. In particular, it was only after reading one of these that I started thinking of the combination of the value and output matrices as being a combined low-rank map from the embedding space to itself, which, at least in my mind, made things much clearer than other sources. https://transformer-circuits.pub/2021/framework/index.html
  • History of language models by Brit Cruise, @ArtOfTheProblem: https://youtu.be/OFS90-FX6pg
  • An early paper on how directions in embedding spaces have meaning: https://arxiv.org/pdf/1301.3781.pdf

Russian audio track: Vlad Burmistrov.

Timestamps

  • 0:00 - Predict, sample, repeat
  • 3:03 - Inside a transformer
  • 6:36 - Chapter layout
  • 7:20 - The premise of Deep Learning
  • 12:27 - Word embeddings
  • 18:25 - Embeddings beyond words
  • 20:22 - Unembedding
  • 22:22 - Softmax with temperature
  • 26:03 - Up next
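The description's remark about the value and output matrices acting as one combined low-rank map from the embedding space to itself can be illustrated with a toy sketch. Everything here is an assumption for illustration: the sizes `d_model` and `d_head`, the names `W_V` and `W_O`, and the random integer entries are hypothetical and not taken from the lesson or from any real model. The sketch only checks the general linear-algebra fact that a product passing through a `d_head`-dimensional space has rank at most `d_head`.

```python
from fractions import Fraction
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(m):
    """Matrix rank via Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate column c from every row below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def random_matrix(n_rows, n_cols, rng):
    """Small random integer matrix (illustrative values only)."""
    return [[rng.randint(-5, 5) for _ in range(n_cols)] for _ in range(n_rows)]


# Hypothetical toy sizes: an 8-dimensional embedding and a 2-dimensional head.
d_model, d_head = 8, 2
rng = random.Random(0)

W_V = random_matrix(d_head, d_model, rng)  # maps embedding space down to head space
W_O = random_matrix(d_model, d_head, rng)  # maps head space back up to embedding space

# The combined map is d_model x d_model, but its rank cannot exceed d_head.
W_OV = matmul(W_O, W_V)
combined_rank = rank(W_OV)

print("shape:", (len(W_OV), len(W_OV[0])))
print("rank of combined map:", combined_rank, "(at most", d_head, ")")
```

Although `W_OV` is a full square matrix on the embedding space, it is determined by the two thin factors, which is one way to read the "low-rank map" framing.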
