Reinforcement Learning: ChatGPT and RLHF

Video description: Reinforcement Learning: ChatGPT and RLHF

Reinforcement learning from human feedback (RLHF), and how it's used to help train large language models like ChatGPT.
Part 3 of the RL from scratch series.
   • Reinforcement Learning from scratch  

0:00 - intro
0:06 - large language models
0:35 - learning to tell jokes
1:13 - fine-tuning with better data
1:26 - positive and negative examples
2:03 - reinforcement learning for LLMs
3:00 - labeling fewer examples
3:56 - reward networks
5:08 - summing it up
5:23 - variants
5:57 - ChatGPT, Bard, Claude, Llama
6:09 - finally, a good joke!
