Physics of Language Models: Part 3.1 + 3.2, Knowledge Storage, Extraction and Manipulation

Timecodes
0:00 - Prelude
6:59 - Toy Example and Motivation
12:07 - Definitions
16:07 - Result 1: Mixed Training
21:38 - Result 2: Pretrain and Finetune
23:37 - Result 3: Knowledge Augmentation
28:21 - Result 4: P-Probing
33:29 - Result 5: Q-Probing
36:25 - Result 6: Celebrity Can Help Minority
41:00 - Result 7: Bidirectional Model + MLM
46:02 - Start of Knowledge Manipulation
46:57 - Result 8: Knowledge Partial/Dual Retrieval
51:47 - Result 9: Knowledge Classification and Comparison
1:04:44 - Result 10: Knowledge Inverse Search (Reversal Curse)
1:15:37 - Conclusion

This is an expanded version of the talk I gave about the following two papers.

(Results 1-7)
Even if an LLM losslessly memorizes the pretraining data, it may not be possible to finetune it to extract that knowledge. Probing techniques suggest that data augmentation is necessary at the pretraining level, regardless of model size, training time, or finetuning choices. https://arxiv.org/abs/2309.14316
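To make "data augmentation at the pretraining level" concrete, here is a minimal sketch in the spirit of the paper's synthetic-biography setting: each person's facts are rewritten many times with varied sentence templates and shuffled sentence order, so the same knowledge appears in diverse surface forms during pretraining. The attribute names, templates, and the augment helper below are illustrative placeholders, not the paper's actual data pipeline.

```python
import random

# Hypothetical biography record in the spirit of the paper's synthetic
# biography data; the person and attribute names are placeholders.
PERSON = {
    "name": "Anya Briar Forger",
    "birth_date": "October 2, 1996",
    "birth_city": "Princeton, NJ",
    "university": "MIT",
}

# A few sentence templates per attribute; a real pipeline would use many more.
TEMPLATES = {
    "birth_date": [
        "{name} was born on {birth_date}.",
        "{name} came into this world on {birth_date}.",
    ],
    "birth_city": [
        "{name} spent her early years in {birth_city}.",
        "{name} grew up in {birth_city}.",
    ],
    "university": [
        "{name} graduated from {university}.",
        "{name} received her education at {university}.",
    ],
}

def augment(person: dict, n_rewrites: int, seed: int = 0) -> list[str]:
    """Produce n_rewrites biography paragraphs for one person, varying
    both the wording (template choice) and the sentence order."""
    rng = random.Random(seed)
    paragraphs = []
    for _ in range(n_rewrites):
        sentences = [
            rng.choice(TEMPLATES[attr]).format(**person) for attr in TEMPLATES
        ]
        rng.shuffle(sentences)  # permutation augmentation
        paragraphs.append(" ".join(sentences))
    return paragraphs

for paragraph in augment(PERSON, n_rewrites=3):
    print(paragraph)
```

Result 3 in the talk (Knowledge Augmentation) discusses variants of exactly this idea, such as multiple rewrites and sentence permutation.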

(Results 8-10)
Why do LLMs need Chain of Thought (CoT) even for basic questions (e.g., was Biden born on an even day)? We show that LLMs cannot efficiently manipulate knowledge even when that knowledge is 100% extractable; moreover, inverse knowledge search is simply impossible (a.k.a. the reversal curse). https://arxiv.org/abs/2309.14402
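A loose analogy (mine, not the paper's experimental setup) for these two phenomena: treat stored knowledge as a forward-only map from person to attributes. Classification such as "even day?" becomes two easy steps once a CoT first verbalizes the retrieved fact, whereas inverse search ("who was born on this date?") has no reverse index to consult. The helper names in the Python sketch below are hypothetical; the birth dates are real public facts used only as placeholders.

```python
# Toy analogy: knowledge stored as a forward-only map, person -> birth date.
BIRTHDAYS = {
    "Joe Biden": (1942, 11, 20),
    "Barack Obama": (1961, 8, 4),
}

def retrieve(person: str) -> tuple[int, int, int]:
    """Forward retrieval: a direct lookup, analogous to a question the
    model answers easily once the knowledge is extractable."""
    return BIRTHDAYS[person]

def born_on_even_day_with_cot(person: str) -> bool:
    """Knowledge classification via Chain of Thought: first verbalize
    the retrieved fact, then classify it. Two steps, each one easy."""
    _, _, day = retrieve(person)  # step 1: retrieve the fact
    return day % 2 == 0           # step 2: classify it

def inverse_search(date: tuple[int, int, int]) -> list[str]:
    """Inverse search: who was born on this date? With forward-only
    storage this requires scanning every entry -- the dictionary
    analogue of the reversal curse."""
    return [p for p, d in BIRTHDAYS.items() if d == date]

print(born_on_even_day_with_cot("Joe Biden"))  # True (the 20th is even)
print(inverse_search((1961, 8, 4)))            # ['Barack Obama']
```

The point of the analogy: CoT turns an opaque one-shot classification into retrieval plus a trivial computation, while no amount of forward-direction training builds the reverse index.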
