Open Pretrained Transformers - Susan Zhang | Stanford MLSys #77

Episode 77 of the Stanford MLSys Seminar “Foundation Models Limited Series”!

Speaker: Susan Zhang

Talk: Trials of developing OPT-175B

Abstract: LLM development at scale is an extraordinarily resource-intensive process, requiring compute resources that many do not have access to. With limited compute time to fully ablate all architectural and hyperparameter choices, the experimentation process can also appear rather haphazard compared to smaller-scale research. In this talk, we will walk through the development lifecycle of OPT-175B, covering the infrastructure and training-convergence challenges faced at scale, along with methods for addressing these issues going forward.

Bio: Susan Zhang is a research engineer at Meta focused on the development of large-scale language models. Previously, she worked on designing photonic chips at Luminous Computing, scaling reinforcement learning systems at OpenAI, and building large-scale data infrastructure systems at Unity Technologies.

Check out our website for the schedule: http://mlsys.stanford.edu
Join our mailing list to get weekly updates: https://groups.google.com/forum/#!for...