Mamba Might Just Make LLMs 1000x Cheaper...

Check out HubSpot's ChatGPT at work bundle! https://clickhubspot.com/twc

Will Mamba bring a revolution to LLMs and challenge the status quo, or is it just a cope that won't last in the long term? Judging by current trajectories, we might not need transformers if Mamba can actually scale, but attention is probably here to stay.

Check out my AI sites leaderboard: https://leaderboard.bycloud.ai/

Special thanks to
- LDJ https://x.com/ldjconfirmed
- Gifted Gummy Bee
for helping with this video!

Mamba: Linear-Time Sequence Modeling with Selective State Spaces
[Paper] https://arxiv.org/abs/2312.00752
[Code] https://github.com/state-spaces/mamba

Transformer: Attention Is All You Need
[Paper] https://arxiv.org/abs/1706.03762

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
[Paper] https://arxiv.org/abs/2401.09417
[Code] https://github.com/hustvl/Vim

Efficiently Modeling Long Sequences with Structured State Spaces
[Paper] https://arxiv.org/pdf/2111.00396.pdf

Flash Attention
[Paper] https://arxiv.org/abs/2205.14135

Flash Attention 2
[Paper] https://arxiv.org/abs/2307.08691

VMamba: Visual State Space Model
[Paper] https://arxiv.org/abs/2401.10166
[Code] https://github.com/MzeroMiko/VMamba

MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
[Paper] https://arxiv.org/abs/2401.04081

MambaByte: Token-free Selective State Space Model
[Paper] https://arxiv.org/abs/2401.13660

Repeat After Me: Transformers are Better than State Space Models at Copying
[Paper] https://arxiv.org/abs/2402.01032


This video is supported by the kind Patrons & YouTube Members:
🙏Andrew Lescelius, alex j, Chris LeDoux, Alex Maurice, Miguilim, Deagan, FiFaŁ, Daddy Wen, Tony Jimenez, Panther Modern, Jake Disco, Demilson Quintao, Shuhong Chen, Hongbo Men, happi nyuu nyaa, Carol Lo, Mose Sakashita, Miguel, Bandera, Gennaro Schiano, gunwoo, Ravid Freedman, Mert Seftali, Mrityunjay, Richárd Nagyfi, Timo Steiner, Henrik G Sundt, projectAnthony, Brigham Hall, Kyle Hudson, Kalila, Jef Come, Jvari Williams, Tien Tien, BIll Mangrum, owned, Janne Kytölä, SO, Richárd Nagyfi


[Discord]   / discord  
[Twitter]   / bycloudai  
[Patreon]   / bycloud  

[Music] massobeats - midnight
[Profile & Banner Art]   / pygm7  
[Video Editor] @askejm, Lunie
