Multi-arm Bandits and Upper Confidence Bound (UCB)

#AIResearch #75HardResearch #75HardAI #ResearchPaperExplained

In this video, we look at the multi-armed bandit problem and the Upper Confidence Bound (UCB) algorithm; a short code sketch of the action-selection rules follows the chapter list below.

Link to the previous video:
Introduction to Reinforcement Learning and Planning (with running example)
• Brief Intro to Reinforcement Learning...


Chapters:
00:00 - Intro
00:30 - Multi-Arm Bandits
02:40 - Epsilon-Greedy Policy
05:45 - Exploration vs Exploitation
12:16 - Upper Confidence Bound
16:30 - Outro
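
For reference, here is a minimal Python sketch of the two action-selection rules covered in the video: epsilon-greedy and UCB, which picks the arm maximizing Q(a) + c * sqrt(ln t / N(a)). The three-armed Bernoulli bandit, its success probabilities, and all function names below are illustrative assumptions, not taken from the video.

import math
import random

def epsilon_greedy(q, n, t, epsilon=0.1):
    # Explore a random arm with probability epsilon, otherwise exploit
    # the current best estimate. (n and t are unused; kept for a uniform
    # interface with ucb.)
    if random.random() < epsilon:
        return random.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])

def ucb(q, n, t, c=2.0):
    # Pick the arm maximizing Q(a) + c * sqrt(ln t / N(a)).
    for a in range(len(q)):
        if n[a] == 0:  # play every arm once so the bonus term is defined
            return a
    return max(range(len(q)),
               key=lambda a: q[a] + c * math.sqrt(math.log(t) / n[a]))

def run(select, arm_probs, steps=10_000):
    q = [0.0] * len(arm_probs)  # running-average value estimates
    n = [0] * len(arm_probs)    # pull counts per arm
    total = 0.0
    for t in range(1, steps + 1):
        a = select(q, n, t)
        r = 1.0 if random.random() < arm_probs[a] else 0.0  # Bernoulli reward
        n[a] += 1
        q[a] += (r - q[a]) / n[a]  # incremental mean update
        total += r
    return total / steps

if __name__ == "__main__":
    probs = [0.2, 0.5, 0.75]  # hypothetical arm success probabilities
    print("epsilon-greedy avg reward:", run(epsilon_greedy, probs))
    print("UCB avg reward:           ", run(ucb, probs))

Unlike epsilon-greedy, which explores uniformly at random forever, UCB's exploration bonus shrinks as an arm's pull count grows, so it concentrates play on the best arm over time.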
