Coding Llama 3 from scratch in PyTorch - Part 1


In this video series, you will learn how to train and fine-tune the Llama 3 model from scratch.

The goal is to code Llama 3 from scratch in PyTorch, creating models with 3B, 6B, 35B, and 45B parameters. In this first video, you'll learn about upcycling, downcycling, and Infini-attention.
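As a taste of what's ahead, here is a minimal PyTorch sketch of the two checkpoint-recycling ideas: upcycling (replicating a pretrained dense FFN into the experts of an MoE layer) and downcycling (inheriting only the first few transformer blocks of a larger checkpoint to seed a smaller model). The helper names and the Llama-style `model.layers` / `model.config` attributes are illustrative assumptions, not the exact code from the notebook:

```python
import copy
import torch.nn as nn

def upcycle_ffn_to_moe(ffn: nn.Module, num_experts: int) -> nn.ModuleList:
    """Sparse upcycling sketch: initialize every expert as a copy of the
    pretrained dense FFN; the router/gate is trained from scratch."""
    return nn.ModuleList(copy.deepcopy(ffn) for _ in range(num_experts))

def downcycle(model, keep_layers: int):
    """Downcycling sketch: build a smaller model by keeping only the first
    `keep_layers` transformer blocks of a pretrained checkpoint
    (hypothetical helper; assumes a Llama-style `model.layers` ModuleList)."""
    small = copy.deepcopy(model)
    small.layers = nn.ModuleList(list(small.layers)[:keep_layers])
    small.config.num_hidden_layers = keep_layers  # keep config in sync
    return small
```

Both tricks reuse pretrained weights as the starting point, so the new model needs far fewer training tokens than training from random initialization.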

📚 Papers:
- Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints: https://arxiv.org/abs/2212.05055
- Pre-training Small Base LMs with Fewer Tokens: https://arxiv.org/abs/2404.08634
- Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention: https://arxiv.org/abs/2404.07143
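The core idea of the Infini-attention paper above is a per-head compressive memory that is read with a kernelized query, updated linearly with the keys and values, and then mixed with ordinary local attention through a learned gate. Below is a single-head sketch of that retrieval/update/gating math; the class and argument names are illustrative, not the paper's reference code:

```python
import torch
import torch.nn.functional as F

class InfiniAttentionMemory(torch.nn.Module):
    """Single-head sketch of Infini-attention's compressive memory:
    retrieval, linear update, and gated mixing with local attention."""

    def __init__(self, dim_k: int, dim_v: int):
        super().__init__()
        self.register_buffer("M", torch.zeros(dim_k, dim_v))  # memory matrix
        self.register_buffer("z", torch.zeros(dim_k, 1))      # normalization term
        self.beta = torch.nn.Parameter(torch.zeros(1))        # learned gate

    def forward(self, q, k, v, local_out):
        # q, k: (seq, dim_k); v, local_out: (seq, dim_v)
        sq = F.elu(q) + 1.0  # sigma(Q): kernel feature map from the paper
        sk = F.elu(k) + 1.0  # sigma(K)
        # Retrieve: A_mem = sigma(Q) M / (sigma(Q) z)
        mem_out = (sq @ self.M) / (sq @ self.z + 1e-6)
        # Update: M <- M + sigma(K)^T V,  z <- z + sum_t sigma(K_t)
        self.M = self.M + sk.T @ v
        self.z = self.z + sk.sum(dim=0, keepdim=True).T
        # Gate: combine the memory readout with the local attention output
        g = torch.sigmoid(self.beta)
        return g * mem_out + (1 - g) * local_out
```

Because each segment reuses M and z carried over from earlier segments, the mechanism attends over unbounded context at a fixed memory cost.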


💻 To follow along, you can use this Colab notebook:
- https://github.com/Blaizzy/Coding-LLM...

🎥 Coding Llama 2 from scratch video series
Part 1: https://youtube.com/live/XHmag4damTg
Part 2: https://youtube.com/live/LSWDpFmbE90
Part 3: Coding Llama 2 from scratch in PyTorch...
