Mamba 2 - Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Video description

Paper here: https://arxiv.org/abs/2405.21060
Code: https://github.com/state-spaces/mamba...

Notes: https://drive.google.com/file/d/1--XG...

00:00 Intro
01:45 SSMs
08:00 Quadratic form of an SSM
15:02 Expanded form of an SSM
24:00 Attention - it's all you need??
29:55 Kernel attention
32:50 Linear attention
34:32 Relating attention to SSMs
38:35 Defining the M matrix
43:48 Splitting the M matrix
46:30 Off-diagonal decomposition
54:00 Recurrent form of the off-diagonal blocks
1:03:30 Combining the M matrix blocks and code
1:06:22 Complexity and other analysis
