If LLMs are text models, how do they generate images? (Transformers + VQVAE explained)

In this video, I talk about multimodal LLMs, Vector-Quantized Variational Autoencoders (VQ-VAEs), and how modern models like Google's Gemini and Parti, and OpenAI's DALL-E, generate images together with text. I cover everything from the very basics (latent spaces, autoencoders) all the way to more advanced topics (VQ-VAEs, codebook embeddings, etc.).
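As a quick taste of the codebook idea covered in the video, here is a minimal toy sketch (my own illustration, not the video's code; all names, shapes, and sizes are made-up assumptions) of how a VQ-VAE snaps an encoder's continuous output to its nearest codebook embeddings, turning an image into a grid of discrete tokens that a transformer can then predict just like text:

# Toy sketch of VQ-VAE codebook quantization (illustrative only).
import torch

codebook = torch.randn(512, 64)          # 512 learnable codebook embeddings, each 64-dim (assumed sizes)
encoder_output = torch.randn(8, 8, 64)   # e.g. an 8x8 grid of latent vectors for one image

flat = encoder_output.reshape(-1, 64)        # flatten the grid to (64, 64)
dists = torch.cdist(flat, codebook)          # distance from each latent to every codebook entry
token_ids = dists.argmin(dim=1)              # nearest-neighbour index = discrete "image token"
quantized = codebook[token_ids].reshape(8, 8, 64)  # what the decoder actually receives

print(token_ids.reshape(8, 8))  # the image as an 8x8 grid of discrete visual tokens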

Follow on Twitter: @neural_avb
#ai #deeplearning #machinelearning

To support the channel and access the documents, slides, and animations used in this video, consider JOINING the channel on YouTube or Patreon. Members get access to code, project files, scripts, slides, animations, and illustrations for most of the videos on my channel! Learn more about the perks below.

Join and support the channel - https://www.youtube.com/@avb_fj/join
Patreon - patreon.com/neuralbreakdownwithavb

Interesting videos/playlists:
Multimodal Deep Learning - • Multimodal AI from First Principles -...
Variational Autoencoders and Latent Space - • Visualizing the Latent Space: This vi...
From Neural Attention to Transformers - • Attention to Transformers from First ...

Papers to read:
VAE - https://arxiv.org/abs/1312.6114
VQ-VAE - https://arxiv.org/abs/1711.00937
VQ-GAN - https://compvis.github.io/taming-tran...
Gemini - https://assets.bwbx.io/documents/user...
Parti - https://sites.research.google/parti/
DALL-E - https://arxiv.org/pdf/2102.12092.pdf

Timestamps:
0:00 - Intro
3:49 - Autoencoders
6:16 - Latent Spaces
9:50 - VQ-VAE
11:30 - Codebook Embeddings
14:40 - Multimodal LLMs generating images
