Course Hive

DeepLearning.AI Courses - New course with StatQuest with Josh Starmer! Attention in Transformers: Concepts and Code in PyTorch

5.0 (2)
18 learners

This course includes

  • 5.5 hours of video
  • Certificate of completion
  • Access on mobile and TV

Summary

Learn more: https://bit.ly/4jK250b

This course explains the ideas behind the attention mechanism, walks through the algorithm itself, and shows how to code it in PyTorch. Attention in Transformers: Concepts and Code in PyTorch was built in collaboration with StatQuest and is taught by its Founder and CEO, Josh Starmer.

The attention mechanism was a breakthrough that led to transformers, the architecture powering large language models like ChatGPT. Transformers, introduced in the 2017 paper "Attention Is All You Need" by Ashish Vaswani and others, revolutionized AI with their scalable design. Learn how this foundational architecture works to improve your intuition about building reliable, functional, and scalable AI applications.

What you'll do:

  • Understand the evolution of the attention mechanism, a key breakthrough that led to transformers.
  • Learn the relationships between word embeddings, positional embeddings, and attention.
  • Learn about the Query, Key, and Value matrices, how to produce them, and how to use them in attention.
  • Go through the math required to calculate self-attention and masked self-attention to learn how and why the equation works the way it does.
  • Understand the difference between self-attention and masked self-attention: self-attention is used in the encoder to build context-aware embeddings, and masked self-attention is used in the decoder for generative outputs.
  • Learn the details of the encoder-decoder architecture, cross-attention, and multi-head attention, and how they are incorporated into a transformer.
  • Use PyTorch to code a class that implements self-attention, masked self-attention, and multi-head attention.

Enroll now: https://bit.ly/4jK250b
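To give a feel for the self-attention and masked self-attention math the course covers, here is a minimal sketch of scaled dot-product attention, softmax(Q Kᵀ / sqrt(d_k)) V, with an optional causal mask. It is not taken from the course materials. It uses plain Python lists rather than PyTorch so that it runs without dependencies. The function names, the toy token embeddings, and the identity weight matrices are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention(embeddings, w_q, w_k, w_v, causal=False):
    """Scaled dot-product self-attention; causal=True gives masked self-attention."""
    # Project the token embeddings into Query, Key, and Value matrices.
    q = matmul(embeddings, w_q)
    k = matmul(embeddings, w_k)
    v = matmul(embeddings, w_v)

    # Similarity of every query with every key, scaled by sqrt(d_k).
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]

    if causal:
        # Hide later tokens: position i may only attend to positions j <= i.
        scores = [
            [s if j <= i else float("-inf") for j, s in enumerate(row)]
            for i, row in enumerate(scores)
        ]

    # Each row of weights sums to 1 and mixes the Value rows.
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights


# Hypothetical toy input: 3 tokens with 2-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Identity projections keep the example easy to follow; real weights are learned.
identity = [[1.0, 0.0], [0.0, 1.0]]

plain_out, plain_w = attention(tokens, identity, identity, identity)
masked_out, masked_w = attention(tokens, identity, identity, identity, causal=True)

print("self-attention weights:       ", plain_w)
print("masked self-attention weights:", masked_w)
```

With the mask on, the first token can only attend to itself, so its attention weights put all the mass on position 0. Without the mask, every token attends to every other token, which is the encoder-style behavior described above.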
