Optimization Tricks: momentum, batch-norm, and more


Deep Learning Crash Course playlist: • Deep Learning Crash Course

How to Design a Convolutional Neural Network: • How to Design a Convolutional Neural ...

Highlights (a short code sketch of these updates follows the list):
Stochastic Gradient Descent
Momentum Algorithm
Learning Rate Schedules
Adaptive Methods: AdaGrad, RMSProp, and Adam
Internal Covariate Shift
Batch Normalization
Weight Initialization
Local Minima
Saddle Points
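
For quick reference, here is a minimal NumPy sketch of the momentum, Adam, and batch-norm updates listed above. It is not taken from the video; the function names and default hyperparameters are illustrative assumptions only.

import numpy as np

def sgd_momentum_step(param, grad, velocity, lr=0.01, beta=0.9):
    # One SGD-with-momentum update: velocity is an exponentially decaying
    # accumulation of past gradients, which damps oscillations.
    velocity = beta * velocity - lr * grad
    return param + velocity, velocity

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # One Adam update: momentum plus RMSProp-style per-parameter scaling,
    # with bias correction for the early steps (t starts at 1).
    m = beta1 * m + (1 - beta1) * grad          # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment (RMSProp-style) estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def batch_norm_forward(x, gamma, shift, eps=1e-5):
    # Batch normalization (training mode): normalize each feature over the
    # batch axis, then apply the learned scale (gamma) and shift parameters.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + shift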

References and further reading:

Deep Learning by Ian Goodfellow:
http://www.deeplearningbook.org/

Stochastic gradient descent
https://en.wikipedia.org/wiki/Stochas...

Adaptive subgradient methods for online learning and stochastic optimization
http://jmlr.org/papers/volume12/duchi...

RMSProp Lecture Notes by Geoffrey Hinton
https://www.cs.toronto.edu/~tijmen/cs...

Adam: A method for stochastic optimization
https://arxiv.org/pdf/1412.6980

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
https://arxiv.org/pdf/1502.03167.pdf

Saddle point
https://en.wikipedia.org/wiki/Saddle_...

Understanding the difficulty of training deep feedforward neural networks
http://proceedings.mlr.press/v9/gloro...

#deeplearning #machinelearning
