The Mamba in the Llama: Distilling and Accelerating Hybrid Models

How to find the Mamba in your Llama (and make it fast).

Work led by Junxiong Wang and Daniele Paliotta with Avner May and Tri Dao advising.

Arxiv Paper: https://arxiv.org/abs/2408.15237
Code: https://github.com/jxiw/MambaInLlama
Tutorial (on Mamba): "Do we need Attention? A Mamba Primer"

Also check out:
"Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models": https://arxiv.org/abs/2408.10189
Rene: https://cartesia.ai/blog/2024-08-27-o...