Actor-Critic Model Predictive Control (ICRA 2024)

An open research question in robotics is how to combine the benefits of model-free reinforcement learning (RL), known for its strong task performance and flexibility in optimizing general reward formulations, with the robustness and online replanning capabilities of model predictive control (MPC). This paper provides an answer by introducing a new framework called Actor-Critic Model Predictive Control. The key idea is to embed a differentiable MPC within an actor-critic RL framework. The proposed approach combines the short-term predictive optimization capabilities of MPC with the exploratory and end-to-end training properties of RL. The resulting policy manages short-term decisions through the MPC-based actor and long-term prediction via the critic network, unifying the benefits of model-based control and end-to-end learning. We validate our method in both simulation and the real world with a quadcopter platform across various high-level tasks. We show that the proposed architecture can achieve real-time control performance, learn complex behaviors via trial and error, and retain the predictive properties of the MPC to better handle out-of-distribution behavior.
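The idea of an MPC-based actor paired with a critic can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scalar toy system in place of a quadcopter, a finite-horizon LQR solve in place of a general MPC, and central finite differences in place of analytic differentiation through the optimizer. All names (`mpc_action`, `actor`, `actor_grad`, `critic`, `td_error`) and all constants are hypothetical, chosen only for illustration.

```python
import math

# Assumed toy model: scalar dynamics x' = A*x + B*u (not the quadcopter model).
A, B = 1.0, 0.1
R = 0.1        # fixed control penalty in the MPC cost (assumption)
HORIZON = 5    # short prediction horizon (assumption)


def mpc_action(x, q, x_ref):
    """Return the first control of a finite-horizon LQR that tracks x_ref.

    With A = 1, any x_ref is an equilibrium for u = 0, so the problem can be
    solved on the error e = x - x_ref with a backward Riccati recursion.
    """
    p = q      # terminal cost weight
    k = 0.0
    for _ in range(HORIZON):
        k = (B * p * A) / (R + B * p * B)   # feedback gain at this stage
        p = q + A * p * (A - B * k)         # Riccati update
    return -k * (x - x_ref)


def actor(x, theta):
    """MPC-based actor. theta = (log_q, x_ref) are learnable cost parameters;
    in a full system they could be produced by a neural network."""
    log_q, x_ref = theta
    return mpc_action(x, math.exp(log_q), x_ref)


def actor_grad(x, theta, eps=1e-6):
    """Sensitivity of the action to theta via central finite differences.

    This is a stand-in for differentiating through the MPC analytically.
    """
    grads = []
    for i in range(len(theta)):
        hi, lo = list(theta), list(theta)
        hi[i] += eps
        lo[i] -= eps
        grads.append((actor(x, hi) - actor(x, lo)) / (2 * eps))
    return grads


def critic(x, w):
    """Long-horizon value estimate, here a quadratic V(x) = w0 + w1 * x^2."""
    return w[0] + w[1] * x * x


def td_error(x, r, x_next, w, gamma=0.99):
    """One-step temporal-difference error that would drive critic and actor updates."""
    return r + gamma * critic(x_next, w) - critic(x, w)


if __name__ == "__main__":
    theta = [0.0, 2.0]            # log_q = 0 (q = 1), x_ref = 2
    x = 3.0
    u = actor(x, theta)           # short-term decision from the MPC
    x_next = A * x + B * u
    reward = -(x_next - 2.0) ** 2
    print("action:", u)
    print("d action / d theta:", actor_grad(x, theta))
    print("TD error:", td_error(x, reward, x_next, [0.0, -1.0]))
```

In this sketch the MPC handles the short-horizon optimization, the critic supplies the long-horizon value signal, and the action sensitivities are what a policy-gradient update would use to adjust the cost parameters. A learned network producing `theta` and an actual RL training loop are deliberately left out.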

Reference:
A. Romero, Y. Song, D. Scaramuzza,
"Actor-Critic Model Predictive Control",
IEEE International Conference on Robotics and Automation, 2024
PDF: https://rpg.ifi.uzh.ch/docs/ICRA24_Ro...

For more info about our research on:
Agile Drone Flight: http://rpg.ifi.uzh.ch/aggressive_flig...
Drone Racing: http://rpg.ifi.uzh.ch/research_drone_...
Machine Learning: http://rpg.ifi.uzh.ch/research_learni...

Affiliations:
A. Romero, Y. Song, and D. Scaramuzza are with the Robotics and Perception Group, Department of Informatics, University of Zurich, and Department of Neuroinformatics, University of Zurich and ETH Zurich, Switzerland
http://rpg.ifi.uzh.ch/

Music Credits: scottholmesmusic.com under Free Creative Commons License
