Natasha Jaques

I'm an assistant professor at the University of Washington and a Staff Research Scientist at Google DeepMind. I give talks about my research on AI, machine learning, and reinforcement learning. If you want to learn more, check out my website https://natashajaques.ai/.

3 - Personalized RLHF
2 - Deep RL and RL post-training intro
1 - Introduction
Self Play for Safety - Online Multi-Agent Adversarial Training for Provably Robust LLMs
What Makes ChatGPT Chat? Modern AI for the layperson
Reinforcement Learning (RL) for LLMs
Social Reinforcement Learning talk at RLDM
Badly trained policy after 40000 steps
Multi-agent DQN training step 90000 trajectory video
Multi-agent DQN training step 0 trajectory video
Learning to grab with bell as reward
Intel Deep Learning Community of Practice talk
Natasha Jaques PhD Thesis Defense
Personalized Multi-task Learning for Predicting Tomorrow's Mood, Stress, and Health
VHRED Cornell baseline
Influence agent in Harvest game
A3C baseline in Harvest
Agent trained with intrinsic social influence reward - Tragedy of the Commons
Agent trained with intrinsic social influence reward
A3C will not free other agent trapped in a box
Influence agent frees compatriot trapped in a box
Note RNN
Q
G
Basic LSTM
Psi
RL Tuner
EDAExplorer PeakTutorial
EDAExplorer ArtifactTutorial
The Challenge