DeepMind x UCL RL Lecture Series - Model-free Control [6/13]

Research Scientist Hado van Hasselt covers prediction algorithms for policy improvement, leading to algorithms that can learn good behaviour policies from sampled experience.

Slides: https://dpmd.ai/modelfreecontrol
Full video lecture series: https://dpmd.ai/DeepMindxUCL21
