Visually explaining Byte Latent Transformers - LLMs just got a massive breakthrough!

In this video, we discuss Meta's latest paper on the Byte Latent Transformer (BLT) model, from the paper "Byte Latent Transformer: Patches Scale Better Than Tokens". Quite literally, we go over each word in that title and what it means. Personally, I think dynamic compute allocation is a huge deal, and this feels like a pretty exciting research direction for LLMs going forward. I tried to present visually engaging material that explains the architectural design behind the various ideas in the paper.
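
If you want to play with the core idea before watching, here is a minimal Python sketch of entropy-based patching, the mechanism behind BLT's dynamic compute allocation. This is my own toy illustration, not the paper's code: the entropy_model interface, the 2.0-bit threshold, and the unigram toy model are all assumptions made for this example. A small model scores how unpredictable the next byte is, and a new patch starts wherever that entropy spikes.

import math

def next_byte_entropy(probs):
    # Shannon entropy (in bits) of a next-byte distribution.
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropy_patches(data, entropy_model, threshold=2.0):
    # Split a byte sequence into variable-length patches, cutting wherever
    # the predicted next-byte entropy exceeds `threshold`.
    # `entropy_model(prefix)` returns a 256-way next-byte distribution;
    # the name and interface are illustrative, not from the paper.
    patches, start = [], 0
    for i in range(1, len(data)):
        if next_byte_entropy(entropy_model(data[:i])) > threshold:
            patches.append(data[start:i])  # hard-to-predict spot: new patch
            start = i
    patches.append(data[start:])
    return patches

def toy_model(prefix):
    # Toy stand-in for the paper's small byte-level LM: maximally uncertain
    # (8 bits) right after a space, near-deterministic everywhere else.
    if prefix.endswith(b" "):
        return [1 / 256] * 256
    return [0.99] + [0.01 / 255] * 255

print(entropy_patches(b"byte latent transformer", toy_model))
# -> [b'byte ', b'latent ', b'transformer']

With this toy model, patches break at word boundaries: predictable byte runs get grouped into long patches that the big latent transformer processes as single units, which is exactly where the compute savings come from.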

Paper link: https://arxiv.org/abs/2412.09871

#deeplearning #ai

Join our Patreon to support the channel! Your support keeps the channel going! Members also get access to all the code, slides, documents, and animations produced in all my videos, including this one. Files are usually shared within a day of upload.

Patreon link: https://www.patreon.com/neuralbreakdownwithavb
Direct link for the material used in this video: https://www.patreon.com/posts/byte-latent-blt-118825972

Related videos you may enjoy:
Transformers playlist: Attention to Transformers from First ...

The History of Attention: Turns out Attention wasn't all we nee...
Coding Language Models from scratch: From Attention to Generative Language...
Latent Space Models: Visualizing the Latent Space: This vi...
Advanced Latent Space LLMs: If LLMs are text models, how do they ...
History of NLP: 10 years of NLP history explained in ...


Timestamps:
0:00 - Intro
1:21 - Intro to Transformers
3:39 - Subword Tokenizers
4:48 - Embeddings
7:10 - How does vocab size impact Transformer FLOPs?
11:15 - Byte Encodings
12:33 - Pros and Cons of Byte Tokens
15:05 - Patches
17:00 - Entropy
19:34 - Entropy model
23:40 - Dynamically Allocate Compute
25:11 - Latent Space
27:15 - BLT Architecture
29:30 - Local Encoder
34:06 - Latent Transformer and Local Decoder in BLT
36:08 - Outro
