RoPE: Rotary Position Embedding to 100K Context Length

RoPE (Rotary Position Embedding) explained in simple terms: how self-attention in Transformers is computed with a relative position encoding, allowing the context length of LLMs to be extended.
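
The video itself does not ship code, but a minimal NumPy sketch of the core idea could look like the following (the function name rope_rotate, the shapes, and the toy sizes are illustrative assumptions, not taken from the paper or the video):

```python
import numpy as np

def rope_rotate(x, positions, base=10000.0):
    """Apply rotary position embedding to vectors x of shape (seq_len, d), d even."""
    d = x.shape[-1]
    # One rotation frequency per 2-D sub-plane: theta_i = base^(-2i/d)
    inv_freq = base ** (-np.arange(0, d, 2) / d)          # shape (d/2,)
    angles = positions[:, None] * inv_freq[None, :]       # shape (seq_len, d/2)
    cos, sin = np.cos(angles), np.sin(angles)

    x1, x2 = x[..., 0::2], x[..., 1::2]                   # split dimensions into pairs
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin                  # 2-D rotation of each pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

# Rotating both queries and keys makes the attention score q_m . k_n
# depend only on the relative offset m - n, not on absolute positions.
seq_len, d = 8, 16
q, k = np.random.randn(seq_len, d), np.random.randn(seq_len, d)
pos = np.arange(seq_len)
scores = rope_rotate(q, pos) @ rope_rotate(k, pos).T      # attention logits before softmax
```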

All rights with the authors:
RoFormer: Enhanced Transformer with Rotary Position Embedding (RoPE)
https://arxiv.org/pdf/2104.09864

#airesearch
#aiexplained
