Stanford CS25: V4 I Behind the Scenes of LLM Pre-training: StarCoder Use Case

May 23, 2024
Speaker: Loubna Ben Allal, Hugging Face

As large language models (LLMs) become essential to many AI products, learning to pre-train and fine-tune them is now crucial. In this talk, we will explore the intricacies of training LLMs from scratch, including lessons on scaling laws and data curation. Then, we will study the StarCoder use case as an example of LLMs tailored for code, highlighting how their development differs from that of standard LLMs. Additionally, we will discuss data governance and evaluation, crucial aspects of today's conversations about LLMs and AI that are frequently overshadowed by pre-training discussions.

About the speaker: Loubna Ben Allal is a Machine Learning Engineer on the Science team at Hugging Face, working on large language models for code and synthetic data generation. She is part of the core team behind the BigCode Project and has co-authored The Stack dataset and the StarCoder models for code generation. Loubna holds master's degrees in Mathematics and Deep Learning from Ecole des Mines de Nancy and ENS Paris-Saclay.

More about the course can be found here: https://web.stanford.edu/class/cs25/

View the entire CS25 Transformers United playlist: Stanford CS25 - Transformers United
