In general, robots need to learn how to act in a world shared with other agents. In this lecture, I cover the foundations of multi-agent reinforcement learning (MARL) and connect them to recent work on reinforcement learning from human feedback (RLHF). Topics include structured learning to reduce the complexity of the MARL problem, centralized and decentralized learning algorithms, learning from online human feedback, and training large language models (LLMs) from human preference data (aka RLHF).