Robot Learning 2025: Foundational Models for Robotics and Scaling DeepRL - RobotLearning: Scaling Continuous Deep QLearning Part1

This course includes

  • 34.5 hours of video
  • Certificate of completion
  • Access on mobile and TV

Summary

I explain DDPG as an early deterministic policy gradient method and as the transition from deep Q-learning, which does not work directly for continuous actions because the maximum over actions can no longer be enumerated. I detail how we approximate the maximizing action with a policy model mu, sample batches so that the training data are approximately i.i.d., and use target networks for training. I explain the policy update, which computes gradients through the Q-function, and the Polyak averaging used for target network updates, noting its empirical success but questioning whether it is still a contraction in theory.

I then delve into practical challenges, such as preventing policy outputs from exploding to infinity, and solutions such as regularization, tanh activations, and gradient squashing. We discuss exploration noise, comparing Gaussian and Ornstein-Uhlenbeck noise, and the choice between discrete and continuous action spaces, emphasizing the importance of multimodality. I highlight the sensitivity of Q-functions to policy inputs and how adding noise when computing target values, as in TD3, improves robustness and performance.

I question the continued use of simple environments like the inverted pendulum for algorithm evaluation, advocating for more complex tasks that better differentiate algorithm performance and reflect real-world challenges, much like progressing from simple addition to complex homework assignments in our studies. Finally, we cover a number of recent papers that explore more scalable versions of deep Q-learning methods using mixtures of experts, layer normalization, and network structure.
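Several of the mechanisms mentioned above (Polyak averaging of target networks, tanh-bounded policy outputs, Ornstein-Uhlenbeck exploration noise, and TD3-style noisy targets) are small enough to sketch in a few lines. The following is a minimal illustration only, not the lecture's implementation: it assumes scalar actions, represents network parameters as plain lists of floats, and uses hypothetical function names and illustrative default hyperparameters. In TD3 the noise is added to the target action before the target Q-value is evaluated, which is the variant sketched here.

```python
import math
import random


def polyak_update(target, online, tau=0.005):
    """Soft target update: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * w for t, w in zip(target, online)]


def squash_action(pre_activation, max_action=1.0):
    """Bound the policy output with tanh so it cannot grow without limit."""
    return max_action * math.tanh(pre_activation)


def ou_step(x, rng, theta=0.15, sigma=0.2, mu=0.0, dt=1.0):
    """One Euler step of Ornstein-Uhlenbeck noise (temporally correlated).

    Unlike independent Gaussian noise, each sample is pulled back toward
    mu at rate theta, so consecutive samples are correlated.
    """
    drift = theta * (mu - x) * dt
    diffusion = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x + drift + diffusion


def smoothed_target_action(target_action, rng, noise_std=0.2,
                           noise_clip=0.5, max_action=1.0):
    """TD3-style smoothing: add clipped Gaussian noise to the target action,
    then clip the result back into the valid action range."""
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    return max(-max_action, min(max_action, target_action + noise))


def td_target(reward, done, next_q, gamma=0.99):
    """Bootstrapped target y = r + gamma * (1 - done) * Q_target(s', a')."""
    return reward + gamma * (1.0 - done) * next_q


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    print(polyak_update([0.0, 0.0], [1.0, -1.0], tau=0.1))
    print(squash_action(50.0))  # stays within [-1, 1]
    noise = 0.0
    for _ in range(3):
        noise = ou_step(noise, rng)
    print(noise)
    print(smoothed_target_action(0.9, rng))
```

In a full agent, `next_q` would come from the target Q-network evaluated at the next state and the smoothed target action, and `polyak_update` would be applied to both the target policy and the target Q-network after each gradient step.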
