Understanding STaR and how it powers Claude and Gemini/Gemma 2 (and maybe OpenAI Q* or Strawberry)

Video description

STaR is short for Self-Taught Reasoner. It is rumored to power OpenAI's Q* (now Strawberry), but definitely powers Claude 3.5 Sonnet and the Gemma / Gemini models. In this video, Chris breaks down how self-taught reasoning works and how it is used in the fine-tuning phase of a model to improve training. Chris also shows how you can use NVIDIA's Nemotron reward model to judge the outputs for STaR. If you want to understand how to use the same techniques that frontier AI models such as Anthropic Claude and Google Gemini / Gemma use to improve their fine-tuning, check out this video.
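For readers who want a feel for the idea before watching, here is a minimal sketch of one STaR data-collection pass. It is not the code from the video. The names `star_collect`, `generate`, `judge`, and `threshold` are hypothetical, and the model and the reward model are replaced by toy stand-in functions so the example runs on its own. The original STaR recipe keeps a rationale when its final answer is correct and retries failed questions with the correct answer as a hint. The extra `judge` score stands in for the reward-model check described above, and it is an assumption about how the two would be combined.

```python
from typing import Callable, List, NamedTuple, Tuple


class Example(NamedTuple):
    question: str
    rationale: str
    answer: str


def star_collect(
    problems: List[Tuple[str, str]],
    generate: Callable[[str], Tuple[str, str]],
    judge: Callable[[str, str, str], float],
    threshold: float = 0.5,
) -> List[Example]:
    """Collect (question, rationale, answer) triples worth fine-tuning on."""
    kept = []
    for question, gold in problems:
        rationale, answer = generate(question)
        if answer != gold:
            # "Rationalization": retry once with the correct answer as a hint.
            rationale, answer = generate(f"{question} (hint: the answer is {gold})")
        # Keep only correct answers whose rationale the judge scores highly.
        if answer == gold and judge(question, rationale, answer) >= threshold:
            # Store the original question, without the hint.
            kept.append(Example(question, rationale, answer))
    return kept


def toy_generate(prompt: str) -> Tuple[str, str]:
    # Hypothetical stand-in for a language model call.
    if prompt.startswith("2+2"):
        return "2 plus 2 is 4", "4"
    if prompt.startswith("3*3") and "hint" in prompt:
        return "3 times 3 is 9", "9"
    return "not sure", "0"


def toy_judge(question: str, rationale: str, answer: str) -> float:
    # Hypothetical stand-in for a reward model: 1.0 if the rationale mentions the answer.
    return 1.0 if answer in rationale else 0.0


problems = [("2+2", "4"), ("3*3", "9"), ("5-1", "4")]
dataset = star_collect(problems, toy_generate, toy_judge)
for example in dataset:
    print(example)
```

In a full loop, the model would be fine-tuned on `dataset` and the pass repeated with the improved model. That step is left out here because it depends on the training stack.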
