SNAPP Seminar || Chang-Han Rhee (Northwestern University) || February 21, 2022


SNAPP Webpage: https://sites.google.com/view/snappse...

Speaker: Chang-Han Rhee (Northwestern University)
Monday, February 21, 2022, 11:30 am US Eastern Time

Title: Eliminating Sharp Minima from SGD with Truncated Heavy-Tailed Noise

Abstract: The empirical success of deep learning is often attributed to SGD’s mysterious ability to avoid sharp local minima in the loss landscape, as sharp minima are known to lead to poor generalization. Recently, empirical evidence of heavy-tailed gradient noise was reported in many deep learning tasks, and it was argued that SGD can escape sharp local minima in the presence of such heavy-tailed gradient noise, providing a partial explanation of the mystery. This talk analyzes a popular variant of SGD where gradients are truncated above a fixed threshold. We show that it achieves a stronger notion of avoiding sharp minima: it can effectively eliminate sharp local minima entirely from its training trajectory. Further, we rigorously characterize the first exit times from local minima and prove that under some structural conditions, the dynamics of heavy-tailed truncated SGD with small learning rates closely resemble those of a continuous-time Markov chain that never visits any sharp minima. Real data experiments on deep neural networks confirm our theoretical prediction that SGD with truncated heavy-tailed gradient noise finds flatter local minima and achieves better generalization.
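The truncation mechanism described in the abstract (rescaling any stochastic gradient whose norm exceeds a fixed threshold before taking the SGD step) can be sketched as follows. This is a minimal illustration only; the function name, learning rate, and threshold value are assumptions for the example, not taken from the talk.

```python
import numpy as np

def truncated_sgd_step(w, grad, lr=0.01, threshold=1.0):
    """One SGD step where the stochastic gradient is truncated above a
    fixed norm threshold (illustrative sketch, not the authors' code)."""
    norm = np.linalg.norm(grad)
    if norm > threshold:
        # Rescale so the gradient norm equals the threshold; a rare
        # heavy-tailed spike in the noise can no longer produce an
        # arbitrarily large parameter update.
        grad = grad * (threshold / norm)
    return w - lr * grad

# A large (heavy-tailed) gradient sample gets truncated...
w_big = truncated_sgd_step(np.zeros(2), np.array([30.0, 40.0]),
                           lr=0.1, threshold=5.0)
# ...while a small gradient passes through unchanged.
w_small = truncated_sgd_step(np.zeros(2), np.array([0.3, 0.4]),
                             lr=0.1, threshold=5.0)
```

The point of the talk is that, combined with heavy-tailed gradient noise, this simple truncation changes the qualitative exit behavior from local minima, so that sharp minima are effectively eliminated from the training trajectory.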

This talk is based on the joint work with Xingyu Wang and Sewoong Oh.

Bio: Chang-Han Rhee is an Assistant Professor in Industrial Engineering and Management Sciences at Northwestern University. Before joining Northwestern University, he was a postdoctoral researcher at Centrum Wiskunde & Informatica and Georgia Tech. He received his Ph.D. in Computational and Mathematical Engineering from Stanford University. His research interests include applied probability, stochastic simulation, and machine learning. He received the Outstanding Publication Award from the INFORMS Simulation Society in 2016, won the Best Student Paper Award at the 2012 Winter Simulation Conference, and was a finalist in the 2013 INFORMS George Nicholson Student Paper Competition.
