Full Transcript
Let's deep dive into the transformer encoder architecture.

ABOUT ME
Subscribe: https://www.youtube.com/c/CodeEmporium?sub_confirmation=1
Medium Blog: https://medium.com/@dataemporium
GitHub: https://github.com/ajhalthor
LinkedIn: https://www.linkedin.com/in/ajay-halthor-477974bb/

RESOURCES
[1] My playlist for all transformer videos before this: https://www.youtube.com/watch?v=QCJQG4DuHT0&list=PLTl9hO2Oobd97qfWC40gOSU8C0iu0m2l4
[2] Transformer Main Paper: https://arxiv.org/abs/1706.03762

PLAYLISTS FROM MY CHANNEL
ChatGPT Playlist of all other videos: https://youtube.com/playlist?list=PLTl9hO2Oobd9coYT6XsTraTBo4pL1j4HJ
Transformer Neural Networks: https://youtube.com/playlist?list=PLTl9hO2Oobd_bzXUpzKMKA3liq2kj6LfE
Convolutional Neural Networks: https://youtube.com/playlist?list=PLTl9hO2Oobd9U0XHz62Lw6EgIMkQpfz74
The Math You Should Know: https://youtube.com/playlist?list=PLTl9hO2Oobd-_5sGLnbgE8Poer1Xjzz4h
Probability Theory for Machine Learning: https://youtube.com/playlist?list=PLTl9hO2Oobd9bPcq0fj91Jgk_-h1H_W3V
Coding Machine Learning: https://youtube.com/playlist?list=PLTl9hO2Oobd82vcsOnvCNzxrZOlrz3RiD

MATH COURSES (7 day free trial)
Mathematics for Machine Learning: https://imp.i384100.net/MathML
Calculus: https://imp.i384100.net/Calculus
Statistics for Data Science: https://imp.i384100.net/AdvancedStatistics
Bayesian Statistics: https://imp.i384100.net/BayesianStatistics
Linear Algebra: https://imp.i384100.net/LinearAlgebra
Probability: https://imp.i384100.net/Probability

OTHER RELATED COURSES (7 day free trial)
Deep Learning Specialization: https://imp.i384100.net/Deep-Learning
Python for Everybody: https://imp.i384100.net/python
MLOps Course: https://imp.i384100.net/MLOps
Natural Language Processing (NLP): https://imp.i384100.net/NLP
Machine Learning in Production: https://imp.i384100.net/MLProduction
Data Science Specialization: https://imp.i384100.net/DataScience
TensorFlow: https://imp.i384100.net/Tensorflow

TIMESTAMPS
0:00 Introduction
0:28 Encoder Overview
1:25 Blowing up the encoder
1:45 Create Initial Embeddings
3:54 Positional Encodings
4:54 The Encoder Layer Begins
5:02 Query, Key, Value Vectors
7:37 Constructing Self Attention Matrix
9:44 Why scaling and Softmax?
10:53 Combining Attention heads
12:46 Residual Connections (Skip Connections)
13:45 Layer Normalization
16:36 Why Linear Layers, ReLU, Dropout
17:46 Complete the Encoder Layer
18:46 Final Word Embeddings
20:04 Sneak Peek of Code
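
CODE SKETCH
The chapters from 5:02 to 13:45 cover query/key/value vectors, the scaled self-attention matrix, softmax, residual connections, and layer normalization. Below is a minimal sketch of how those steps might fit together, assuming a single attention head, no learned scale or shift in the layer norm, and no output projection, feed-forward block, or dropout. It is not the code shown in the video; the function names and the toy identity weight matrices are hypothetical and chosen only for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    largest = max(row)
    exps = [math.exp(v - largest) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (simplifying assumption)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    # Scale by sqrt(d_k) before softmax so the scores do not grow with dimension.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def layer_norm(row, eps=1e-5):
    """Normalize one token vector to zero mean and (roughly) unit variance."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / math.sqrt(var + eps) for v in row]


def attention_sublayer(x, w_q, w_k, w_v):
    """Attention, then residual (skip) connection, then layer normalization."""
    attended, weights = self_attention(x, w_q, w_k, w_v)
    residual = [[a + b for a, b in zip(x_row, a_row)]
                for x_row, a_row in zip(x, attended)]
    return [layer_norm(r) for r in residual], weights


# Toy input: 3 tokens, each a 4-dimensional embedding (made-up values).
x = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 2.0, 0.0, 2.0],
     [1.0, 1.0, 1.0, 1.0]]
# Hypothetical weights: identity matrices stand in for learned parameters.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

out, weights = attention_sublayer(x, identity, identity, identity)
```

Each row of `weights` is a probability distribution over the three tokens, and `out` keeps the input shape (3 tokens by 4 dimensions), which is what lets encoder layers be stacked.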
