Make your LLMs fully utilize the context (paper explained)

Lately, there has been a lot of interest in extending and improving the context handling of LLMs. One good example is the Infini-attention paper from Google in early April, which tweaked the model architecture to improve long-context performance. In late April, Microsoft answered with a data-driven solution instead, in a paper titled "Make Your LLM Fully Utilize the Context".
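At a high level, the paper's IN2 (information-intensive) training builds a synthetic long-context QA dataset: a short segment (roughly 128 tokens) carrying the answer is placed at a random position inside a long context assembled from unrelated segments, so the model cannot get away with attending only to the beginning or end. Here is a minimal sketch of that data construction; the function and field names are mine, not the paper's:

```python
import random

def build_in2_example(needle_segment, filler_segments, qa_pair, num_fillers=100):
    """Assemble one IN2-style training sample: hide a short, information-dense
    'needle' segment at a random position inside a long context made of
    unrelated filler segments, paired with a question answerable only from
    the needle. (Illustrative only; names are not from the paper.)"""
    fillers = random.sample(filler_segments, k=min(num_fillers, len(filler_segments)))
    insert_at = random.randrange(len(fillers) + 1)   # any slot, start to end
    segments = fillers[:insert_at] + [needle_segment] + fillers[insert_at:]
    long_context = "\n".join(segments)
    prompt = f"{long_context}\n\nQuestion: {qa_pair['question']}"
    return {"prompt": prompt, "answer": qa_pair["answer"]}
```

The paper pairs such contexts with two kinds of questions: ones answerable from a single segment (fine-grained information awareness, 2:31) and ones requiring integration and reasoning across two or more segments (3:42).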

RELATED LINKS
This paper - https://arxiv.org/abs/2404.16811
Infinite Context Transformers - https://arxiv.org/abs/2404.07143

⌚️ ⌚️ ⌚️ TIMESTAMPS ⌚️ ⌚️ ⌚️
0:00 - Intro
0:36 - Lost in the Middle Challenge in Context
0:59 - Related work in Long Context LLMs
1:34 - Information Intensive Training (IN2 Training)
2:31 - Fine-grained Information awareness
3:42 - Integration and Reasoning of Information
4:36 - Mathematical Representation
5:09 - Training Settings/Details
6:17 - Various Long Context Probing (VAL Probing)
6:33 - Needle in a Haystack for Long Context LLMs
9:20 - Experimental Results
10:20 - Quantitative Results
11:12 - Real-world data performance
12:22 - Summary and Outro
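For intuition on the needle-in-a-haystack probing discussed at 6:33, here is a rough sketch of how such a probe is usually scored: a known fact (the "needle") is planted at varying depths inside haystacks of varying lengths, and we check whether the model retrieves it from each position. `ask_model` is a placeholder for whatever LLM call you use; none of this is the paper's code:

```python
def run_needle_probe(ask_model, haystack_text, needle, question, answer,
                     context_lengths, depth_fractions):
    """Plant a known 'needle' sentence at varying depths within haystacks of
    varying lengths, ask the model about it, and record retrieval success.
    ask_model(prompt) -> str is a placeholder for an actual LLM call."""
    results = []
    for length in context_lengths:
        haystack = haystack_text[:length]          # crude character-based truncation
        for depth in depth_fractions:              # 0.0 = start, 1.0 = end
            pos = int(len(haystack) * depth)
            context = haystack[:pos] + " " + needle + " " + haystack[pos:]
            reply = ask_model(f"{context}\n\nQuestion: {question}")
            results.append({"length": length, "depth": depth,
                            "correct": answer.lower() in reply.lower()})
    return results
```

Plotting `correct` over (length, depth) gives the familiar heatmap; the "lost in the middle" effect from 0:36 shows up as a failure band at intermediate depths.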

OUR KEY LINKS
YouTube: / @aibites
Twitter: / ai_bites
Patreon: / ai_bites
Github: https://github.com/ai-bites

WHO AM I?
I am a Machine Learning researcher/practitioner who has seen the grind of academia and start-ups. I started my career as a software engineer 15 years ago. Because of my love for Mathematics (coupled with a glimmer of luck), I graduated with a Master's in Computer Vision and Robotics in 2016, just as the current AI revolution was getting started. Life has changed for the better ever since.

#machinelearning #deeplearning #aibites
