Full Transcript
AdaBoost is one of those machine learning methods that seems much more confusing than it really is. It's really just a simple twist on decision trees and random forests.

NOTE: This video assumes you already know about Decision Trees...
https://youtu.be/_L39rN6gz7Y
...and Random Forests...
https://youtu.be/J4Wdy0Wc_xQ

Sources:
The original AdaBoost paper by Robert E. Schapire and Yoav Freund:
https://www.sciencedirect.com/science/article/pii/S002200009791504X
And a follow-up by co-creator Schapire:
http://rob.schapire.net/papers/explaining-adaboost.pdf
The idea of using the weights to resample the original dataset comes from Boosting: Foundations and Algorithms, by Robert E. Schapire and Yoav Freund:
https://mitpress.mit.edu/books/boosting
Lastly, Chris McCormick's tutorial was super helpful:
http://mccormickml.com/2013/12/13/adaboost-tutorial/

For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/

If you'd like to support StatQuest, please consider...
Patreon: https://www.patreon.com/statquest
...or...
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join
...buying one of my books, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
https://statquest.org/statquest-store/
...or just donating to StatQuest!
https://www.paypal.me/statquest

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
https://twitter.com/joshuastarmer

0:00 Awesome song and introduction
0:56 The three main ideas behind AdaBoost
3:30 Review of the three main ideas
3:58 Building a stump with the GINI index
6:27 Determining the Amount of Say for a stump
10:45 Updating sample weights
14:47 Normalizing the sample weights
15:32 Using the normalized weights to make the second stump
19:06 Using stumps to make classifications
19:51 Review of the three main ideas behind AdaBoost

Correction:
10:18 The Amount of Say for Chest Pain = (1/2)*log((1-(3/8))/(3/8)) = (1/2)*log((5/8)/(3/8)) = (1/2)*log(5/3) = 0.25, not 0.42.

#statquest #adaboost
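
The corrected Amount of Say can be checked with a few lines of Python. This is only a minimal sketch: it assumes "log" in the correction means the natural log (which reproduces the stated value of roughly 0.25), and the function name `amount_of_say` and the small clamp `eps` are hypothetical choices made for this illustration, not something taken from the video.

```python
import math


def amount_of_say(total_error: float, eps: float = 1e-10) -> float:
    """Return (1/2) * ln((1 - total_error) / total_error)."""
    # Assumption: clamp the error away from 0 and 1 so the ratio never
    # divides by zero or takes log(0). eps is an illustrative safeguard.
    clamped = min(max(total_error, eps), 1.0 - eps)
    return 0.5 * math.log((1.0 - clamped) / clamped)


# Total Error for the Chest Pain stump, as given in the correction: 3/8.
chest_pain_say = amount_of_say(3 / 8)
print(round(chest_pain_say, 4))  # 0.2554
```

Under the same assumptions, a Total Error of 0.5 gives an Amount of Say of 0, errors below 0.5 give positive values, and errors above 0.5 give negative values.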
