Multimodal Embeddings: Introduction & Use Cases (with Python)

🗞️ Get exclusive access to AI resources and project ideas: https://the-data-entrepreneurs.kit.co...
🧑‍🎓 Learn AI in 6 weeks by building it: https://maven.com/shaw-talebi/ai-buil...
--
Multimodal embeddings represent multiple data modalities (e.g., text and images) in the same vector space. Here, I discuss how they are developed and walk through two example use cases: zero-shot image classification and image search.
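
For a quick taste of use case 1, here is a minimal zero-shot classification sketch using the Hugging Face transformers implementation of CLIP [3]. The checkpoint name (openai/clip-vit-base-patch16), the image file cat.png, and the candidate labels are placeholders rather than the exact setup from the video; see the GitHub repo below for the code actually used.

from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# load a pretrained CLIP model and its preprocessor (placeholder checkpoint)
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch16")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")

# candidate labels written as short captions; no training on these classes is needed
labels = ["a photo of a cat", "a photo of a dog"]
image = Image.open("cat.png")  # placeholder image path

# embed the image and the label texts in the shared vector space and compare them
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=1)
print(dict(zip(labels, probs[0].tolist())))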

Resources:
📰 Blog: https://medium.com/towards-data-scien...
💻 GitHub Repo: https://github.com/ShawhinT/YouTube-B...

References:
[1] BERT: https://arxiv.org/abs/1810.04805
[2] ViT: https://arxiv.org/abs/2010.11929
[3] CLIP: https://arxiv.org/abs/2103.00020
[4] Thought2Text: https://arxiv.org/abs/2410.07507
[5] A Simple Framework for Contrastive Learning of Visual Representations: https://arxiv.org/abs/2002.05709

--
Homepage: https://www.shawhintalebi.com

Introduction - 0:00
What are embeddings? - 1:01
Multimodal Embeddings - 5:08
Contrastive Learning - 6:56
Contrastive Learning (Details) - 8:16
Example 1: 0-shot Image Classification - 15:17
Example 2: Image Search - 19:50
What's Next? - 22:47
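
For the "Example 2: Image Search" chapter above, a similar sketch covers text-to-image search: embed a small set of images and a text query with the same CLIP model, then rank the images by cosine similarity. The file names, query string, and checkpoint are again placeholders, not the exact setup from the video.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch16")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")

# placeholder image files to search over
image_paths = ["img1.png", "img2.png", "img3.png"]
images = [Image.open(p) for p in image_paths]

# embed the images and the text query in the same vector space
img_emb = model.get_image_features(**processor(images=images, return_tensors="pt"))
txt_emb = model.get_text_features(
    **processor(text=["a dog playing fetch"], return_tensors="pt", padding=True)
)

# rank images by cosine similarity to the query and return the best match
sims = torch.nn.functional.cosine_similarity(txt_emb, img_emb)
print(image_paths[sims.argmax().item()])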
