Can AI Learn to Cooperate? Multi Agent Deep Deterministic Policy Gradients (MADDPG) in PyTorch

Multi-agent deep deterministic policy gradients (MADDPG) is one of the first successful algorithms for multi-agent artificial intelligence. Cooperation and competition among AI agents will become critical as applications of deep learning expand into our daily lives. In this tutorial, we are going to read through the paper together and then code up the entire multi-agent actor-critic algorithm from scratch in the PyTorch framework.

The main innovation of this algorithm is the use of centralized training and decentralized execution. In brief, we're going to give each agent's critic network access to the observations and actions of all the agents in the simulation, hence the centralized training. The actor networks will only have access to their own perspective, hence the decentralized execution.
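To make that split concrete, here is a minimal PyTorch sketch of who sees what. It is not the code from the video or the repository: the class names `Actor` and `CentralizedCritic`, the hidden size, the `Tanh` output, and the observation and action sizes are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Actor for one agent: its input is only that agent's own observation."""

    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),  # output squashing is an assumption
        )

    def forward(self, obs):
        return self.net(obs)


class CentralizedCritic(nn.Module):
    """Critic for one agent: its input is every agent's observation and action."""

    def __init__(self, obs_dims, act_dims, hidden=64):
        super().__init__()
        in_dim = sum(obs_dims) + sum(act_dims)
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, all_obs, all_acts):
        # all_obs and all_acts are lists of (batch, dim) tensors, one per agent.
        x = torch.cat(list(all_obs) + list(all_acts), dim=1)
        return self.net(x)


# Hypothetical sizes for three agents (not taken from any real environment).
obs_dims = [8, 10, 10]
act_dims = [5, 5, 5]
batch = 4

actors = [Actor(o, a) for o, a in zip(obs_dims, act_dims)]
critics = [CentralizedCritic(obs_dims, act_dims) for _ in obs_dims]

# Each actor acts from its own observation only.
all_obs = [torch.randn(batch, o) for o in obs_dims]
all_acts = [actor(obs) for actor, obs in zip(actors, all_obs)]

# Each critic scores the joint observations and actions of all agents.
q_values = [critic(all_obs, all_acts) for critic in critics]
```

At test time only the actors are needed, so each agent can act without seeing what the others observe or do; the critics exist purely to shape the actors during training.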

We are going to use OpenAI's multi-agent particle environment for training and testing our agents. I'll show you how to get it from GitHub and install the requirements in a virtual environment. We'll cover some of the ways in which the new environments differ from the classic OpenAI Gym environments, and then we're off to coding our agents.

You can read along with the paper here:
https://arxiv.org/pdf/1706.02275.pdf

You can find the environment here:
https://github.com/openai/multiagent-...

Code for this tutorial is here:
https://github.com/philtabor/Multi-Ag...

Learn how to turn deep reinforcement learning papers into code:

Get instant access to all my courses, including the new Prioritized Experience Replay course, with my subscription service. $29 a month gives you instant access to 42 hours of instructional content plus access to future updates, added monthly.


Discounts available for Udemy students (enrolled longer than 30 days). Just send an email to [email protected]

https://www.neuralnet.ai/courses

Or, pick up my Udemy courses here:

Deep Q Learning:
https://www.udemy.com/course/deep-q-l...

Actor Critic Methods:
https://www.udemy.com/course/actor-cr...

Curiosity Driven Deep Reinforcement Learning:
https://www.udemy.com/course/curiosit...

Natural Language Processing from First Principles:
https://www.udemy.com/course/natural-...

Reinforcement Learning Fundamentals:
https://www.manning.com/livevideo/rei...

Here are some books / courses I recommend (affiliate links):
Grokking Deep Learning in Motion: https://bit.ly/3fXHy8W
Grokking Deep Learning: https://bit.ly/3yJ14gT
Grokking Deep Reinforcement Learning: https://bit.ly/2VNAXql

Come hang out on Discord here:
  / discord  

Need personalized tutoring? Help on a programming project? Shoot me an email! [email protected]

Website: https://www.neuralnet.ai
Github: https://github.com/philtabor
Twitter:   / mlwithphil  

Timestamps:
00:00 Intro
02:28 Abstract
03:18 Paper Intro
08:13 Related Works
09:02 Markov Decision Processes
10:42 Q Learning Explained
15:25 Policy Gradients Explained
19:14 Why Multi Agent Actor Critic is Hard
20:15 DDPG Explained
24:21 MADDPG Explained
29:11 Experiments
37:57 How to Implement MADDPG
42:54 MADDPG Algorithm
42:23 Hyperparameters for MADDPG
43:42 Multi Agent Particle Environment
45:09 Environment Install & Testing
55:37 Coding the Replay Buffer
01:07:34 Actor & Critic Networks
01:15:42 Coding the Agent
01:26:05 Coding the MADDPG Class
01:39:23 Coding the Utility Function
01:42:13 Coding the Main Loop
01:46:58 Moment of Truth
01:52:09 Testing on Physical Deception
01:55:48 Conclusion & Results
