DDPG | Panda Robot Arm | Deep Reinforcement Learning

Video description

DDPG (Deep Deterministic Policy Gradient) is a reinforcement learning algorithm for continuous action spaces that combines ideas from Deep Q-Learning and policy gradients. DDPG is an actor-critic algorithm: the Actor learns a deterministic policy that maps each state to the next action, and the Critic acts as a Q-value network that scores the actions generated by the Actor. In this video, we apply the DDPG algorithm to the Robot Reacher task using the Panda Robot Arm.
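
To make the Actor/Critic split concrete, here is a minimal sketch of the core DDPG update rules in plain Python. It is not the notebook code from this video. The linear `actor` and `critic` functions stand in for neural networks, and the `GAMMA`, `TAU`, and learning-rate values are assumptions chosen for illustration. The replay buffer and exploration noise that a full DDPG agent needs are left out.

```python
GAMMA = 0.99   # discount factor (assumed value)
TAU = 0.005    # soft-update rate for the target networks (assumed value)


def actor(theta, state):
    """Deterministic linear policy a = theta . s (stand-in for the Actor network)."""
    return sum(t * s for t, s in zip(theta, state))


def critic(w, state, action):
    """Linear Q-function Q(s, a) = w_s . s + w_a * a (stand-in for the Critic network)."""
    *w_s, w_a = w
    return sum(ws * s for ws, s in zip(w_s, state)) + w_a * action


def td_target(reward, next_state, done, target_theta, target_w):
    """y = r + gamma * (1 - done) * Q'(s', mu'(s')), using the target networks."""
    next_action = actor(target_theta, next_state)
    return reward + GAMMA * (1.0 - done) * critic(target_w, next_state, next_action)


def critic_step(w, state, action, target, lr=0.01):
    """One gradient-descent step on the squared TD error 0.5 * (Q(s, a) - y)^2."""
    error = critic(w, state, action) - target
    features = list(state) + [action]    # dQ/dw for the linear critic
    return [wi - lr * error * f for wi, f in zip(w, features)]


def actor_step(theta, w, state, lr=0.01):
    """Deterministic policy gradient step: ascend dQ/da * dmu/dtheta."""
    dq_da = w[-1]                        # dQ/da for the linear critic
    return [t + lr * dq_da * s for t, s in zip(theta, state)]


def soft_update(target, online, tau=TAU):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target, online)]


# One update on a single made-up transition (s, a, r, s', done).
theta, w = [0.1, -0.2], [0.0, 0.0, 0.5]
target_theta, target_w = list(theta), list(w)
s, a, r, s_next, done = [1.0, 0.5], 0.3, -1.0, [0.9, 0.4], 0.0

y = td_target(r, s_next, done, target_theta, target_w)
w = critic_step(w, s, a, y)
theta = actor_step(theta, w, s)
target_theta = soft_update(target_theta, theta)
target_w = soft_update(target_w, w)
```

The Critic is trained toward the TD target like a Q-network, while the Actor is pushed in the direction that increases the Critic's score for its own action. The slowly moving target copies keep the TD target from shifting too quickly.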

Feel free to leave a comment or message me on Twitter/LinkedIn if you have any questions, doubts, or suggestions for improvement.

Twitter: / mahnasakshay
LinkedIn: / sakshaymahna

Links
Notebook Code: https://www.kaggle.com/code/sakshayma...
Deep Reinforcement Learning Playlist: • Deep Reinforcement Learning
Panda Gym Environment: https://panda-gym.readthedocs.io/en/l...
DDPG Blog: https://towardsdatascience.com/deep-d...
DDPG Paper: https://arxiv.org/abs/1509.02971
