Full Transcript
Course website: http://bit.ly/DLSP20-web
Playlist: http://bit.ly/pDL-YouTube
Speaker: Mike Lewis
Week 12: http://bit.ly/DLSP20-12

0:00:00 – Week 12 – Lecture

LECTURE Part A: http://bit.ly/DLSP20-12-1
In this section, we discuss the various architectures used in NLP applications, beginning with CNNs and RNNs and eventually covering the state-of-the-art architecture, transformers. We then discuss the various modules that comprise transformers and how they make transformers advantageous for NLP tasks. Finally, we discuss tricks that allow transformers to be trained effectively.
0:00:44 – Introduction to deep learning in NLP and language models
0:13:48 – Transformer language model structure and intuition
0:32:55 – Some tricks and facts of Transformer Language Models and decoding Language Models

LECTURE Part B: http://bit.ly/DLSP20-12-2
In this section, we introduce beam search as a middle ground between greedy decoding and exhaustive search. We consider the case of wanting to sample from the generative distribution (i.e. when generating text) and introduce “top-k” sampling. Subsequently, we introduce sequence-to-sequence models (with a transformer variant) and back-translation. We then introduce unsupervised learning approaches for learning embeddings and discuss word2vec, GPT, and BERT.
0:45:32 – Beam Search, Sampling and Text Generation
1:03:31 – Back-translation, word2vec and BERT
1:22:43 – Pre-training for NLP and Next Steps
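
The decoding strategies named in Part B (greedy decoding, beam search, and top-k sampling) can be illustrated with a small sketch. This is not code from the lecture: `TOY_LM` is a hypothetical hand-written table of next-token probabilities standing in for a trained language model, and `beam_search` and `top_k_sample` are illustrative names. The table is chosen so that greedy decoding (beam size 1) picks a locally likely first word and ends up with a less probable sentence than a beam of size 2.

```python
import math
import random

# Hypothetical toy "language model": the next-token distribution depends
# only on the previous token. A real model would condition on the full prefix.
TOY_LM = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.4, "dog": 0.35, "bird": 0.25},
    "a": {"cat": 0.9, "dog": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
    "bird": {"</s>": 1.0},
}


def beam_search(lm, beam_size, max_len=5):
    """Keep the `beam_size` highest-scoring partial sequences at each step.

    Scores are summed log-probabilities. beam_size=1 is greedy decoding.
    Returns a list of (score, tokens) pairs, best first.
    """
    beams = [(0.0, ["<s>"])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == "</s>":
                # Finished hypotheses are carried forward unchanged.
                candidates.append((score, seq))
                continue
            for token, prob in lm[seq[-1]].items():
                candidates.append((score + math.log(prob), seq + [token]))
        # Prune to the best `beam_size` hypotheses.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_size]
        if all(seq[-1] == "</s>" for _, seq in beams):
            break
    return beams


def top_k_sample(probs, k, rng):
    """Sample one token from the k most probable tokens.

    `random.Random.choices` renormalizes the truncated weights, so the
    tail of the distribution can never be drawn.
    """
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens = [token for token, _ in top]
    weights = [prob for _, prob in top]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(beam_search(TOY_LM, beam_size=1)[0][1])
    print(beam_search(TOY_LM, beam_size=2)[0][1])
    print(top_k_sample(TOY_LM["the"], k=2, rng=random.Random(0)))
```

With this table, greedy decoding commits to "the" (probability 0.6) and finishes with total probability 0.24, while a beam of size 2 keeps "a" alive long enough to find "a cat" at 0.36. Top-k sampling with k=2 on the distribution after "the" can only ever return "cat" or "dog".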
