DeepMind x UCL RL Lecture Series - Multi-step & Off Policy [11/13]

Research Scientist Hado van Hasselt discusses multi-step and off-policy algorithms, including various techniques for variance reduction.

Slides: https://dpmd.ai/offpolicy
Full video lecture series: https://dpmd.ai/DeepMindxUCL21
