Soft Actor Critic (V2)

This is the second version of a presentation of the Soft Actor Critic algorithm that I prepared together with Thomas Pierrot.

Note: a newer version exists and is available here:
   • SAC and TQC (RLVS 2021 version)  

The corresponding slides are available here:
http://pages.isir.upmc.fr/~sigaud/tea...

A Colab notebook explaining the SAC code in Stable-Baselines3 is available here:
https://colab.research.google.com/dri...

Feedback to improve any of these materials is very welcome.

Useful links:
The most recent SAC paper:
https://arxiv.org/pdf/1812.05905.pdf

The previous (NeurIPS) version:
https://arxiv.org/pdf/1801.01290.pdf

John Schulman's Deep RL Bootcamp video about the reparametrization trick:
   • Deep RL Bootcamp  Lecture 7  SVG, DDP...  

Many thanks to Thomas Pierrot and Nicolas Perrin for helping me understand the algorithm.
