Full Transcript
Sebastian's books: https://sebastianraschka.com/books/
Slides: https://sebastianraschka.com/pdf/lecture-notes/stat453ss21/L11_norm-and-init__slides.pdf

BatchNorm papers:

Ioffe, S., & Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In International Conference on Machine Learning (pp. 448-456). http://proceedings.mlr.press/v37/ioffe15.html

Santurkar, S., Tsipras, D., Ilyas, A., & Madry, A. (2018). How does batch normalization help optimization? In Advances in Neural Information Processing Systems (pp. 2488-2498). https://arxiv.org/abs/1805.11604

Morcos, A. S., Barrett, D. G., Rabinowitz, N. C., & Botvinick, M. (2018). On the importance of single directions for generalization. https://arxiv.org/abs/1803.06959

Luo, P., Wang, X., Shao, W., & Peng, Z. (2018). Towards understanding regularization in batch normalization. https://arxiv.org/abs/1809.00846

Yang, G., Pennington, J., Rao, V., Sohl-Dickstein, J., & Schoenholz, S. S. (2019). A mean field theory of batch normalization. https://arxiv.org/abs/1902.08129

==============

Some benchmarks (BatchNorm before or after ReLU): https://github.com/ducha-aiki/caffenet-benchmark/blob/master/batchnorm.md#bn----before-or-after-relu

-------

This video is part of my Introduction to Deep Learning course.

Next video: https://youtu.be/RsX01aYbQdI
The complete playlist: https://www.youtube.com/playlist?list=PLTKMiZHVd_2KJtIXOW0zFhFfBaJJilH51
A handy overview page with links to the materials: https://sebastianraschka.com/blog/2021/dl-course.html

-------

If you want to be notified about future videos, please consider subscribing to my channel: https://youtube.com/c/SebastianRaschka
