Reinforcement Learning 2: Markov Decision Processes

This lecture uses the excellent MDP example from David Silver.

Slides: https://cwkx.github.io/data/teaching/...
Colab: https://colab.research.google.com/gis...
Twitter: / cwkx
Next video: Reinforcement Learning Lectures

Content:
Markov Chains
  Markov property
  state transition matrix
  definition and example
Markov Reward Process
  definition and example
  the return
  state value function
  the Bellman equation
Markov Decision Process
  definition and example
  policies
  state and action value functions
  the Bellman equation for MDPs
  optimal state and action value functions
  the Bellman optimality equations
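
As a rough illustration of the "state value function" and "the Bellman equation" items above, here is a minimal sketch that evaluates a Markov reward process by repeatedly applying the backup v <- R + gamma * P v. The three states, the transition matrix P, the rewards R, and the discount GAMMA are made-up numbers for illustration only; they are not the David Silver example used in the lecture, and the function name evaluate_mrp is hypothetical.

```python
# Hypothetical 3-state Markov reward process (NOT the lecture's example).
STATES = ["study", "rest", "done"]  # "done" is an absorbing state

# P[s][t] = probability of moving from state s to state t; each row sums to 1.
P = [
    [0.5, 0.3, 0.2],  # from "study"
    [0.6, 0.4, 0.0],  # from "rest"
    [0.0, 0.0, 1.0],  # from "done"
]
R = [-1.0, 0.5, 0.0]  # assumed expected immediate reward in each state
GAMMA = 0.9           # assumed discount factor


def evaluate_mrp(P, R, gamma, tol=1e-10, max_iters=10_000):
    """Iterate the Bellman backup v <- R + gamma * P v until it stops changing."""
    n = len(R)
    v = [0.0] * n
    for _ in range(max_iters):
        new_v = [
            R[s] + gamma * sum(P[s][t] * v[t] for t in range(n))
            for s in range(n)
        ]
        # Stop once the largest change across states is below the tolerance.
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


values = evaluate_mrp(P, R, GAMMA)
for name, value in zip(STATES, values):
    print(f"v({name}) = {value:.3f}")
```

For a small state space the same values could also be obtained by solving the linear system (I - gamma * P) v = R directly; the iterative form is shown here because it needs no linear algebra library.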

#MDPs #MRPs #markovchains #reinforcementlearning #BellmanEquations #BellmanOptimality
