Turing-NLG, DeepSpeed and the ZeRO optimizer

Microsoft has trained a 17-billion-parameter language model that achieves state-of-the-art perplexity. This video takes a look at the ZeRO optimizer that enabled this breakthrough. ZeRO lets you combine model and data parallelism without a large hit to training speed.

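For reference, here is a minimal sketch of how ZeRO is typically switched on through DeepSpeed's JSON-style config. This is an illustrative example, not the Turing-NLG setup: the toy linear model and the hyperparameters are placeholders, and it assumes a recent DeepSpeed release where `deepspeed.initialize` accepts a config dict.

```python
import torch
import deepspeed

# Placeholder config; ZeRO stage 1 partitions optimizer states
# across the data-parallel workers.
ds_config = {
    "train_batch_size": 32,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 1},
}

# Stand-in for a large Transformer.
model = torch.nn.Linear(1024, 1024)

# The returned engine handles the partitioned optimizer states and
# the gradient communication between workers.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```

Higher ZeRO stages additionally partition gradients (stage 2) and parameters (stage 3), trading more communication for further memory savings.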
https://www.microsoft.com/en-us/resea...
https://www.microsoft.com/en-us/resea...
https://github.com/microsoft/DeepSpeed
https://arxiv.org/abs/1910.02054

Links:
YouTube: / yannickilcher
Twitter: / ykilcher
BitChute: https://www.bitchute.com/channel/yann...
Minds: https://www.minds.com/ykilcher
