PATCH EMBEDDING | Vision Transformers explained

I will cover the Vision Transformer in three parts. The first part, this video, focuses on patch embedding in the Vision Transformer.
I go over all the details and explain everything happening inside the patch embedding in ViT.
I also show what an implementation of patch embedding for the Vision Transformer in PyTorch would look like.
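The patch embedding steps described above (splitting the image into patches, projecting them, prepending a CLS token, and adding positional embeddings) can be sketched in PyTorch roughly like this. This is an illustrative sketch, not the exact module from the video; all names and default sizes (224x224 images, 16x16 patches, 768-dim embeddings, as in the ViT paper) are assumptions.

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """Sketch of a ViT patch embedding module (illustrative, not the video's code)."""

    def __init__(self, img_size=224, patch_size=16, in_channels=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A conv with kernel_size == stride == patch_size is equivalent to
        # cutting non-overlapping patches and applying a shared linear projection.
        self.proj = nn.Conv2d(in_channels, embed_dim,
                              kernel_size=patch_size, stride=patch_size)
        # Learnable CLS token and positional embeddings (one extra position for CLS).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches + 1, embed_dim))

    def forward(self, x):
        B = x.shape[0]
        x = self.proj(x)                        # (B, embed_dim, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)        # (B, num_patches, embed_dim)
        cls = self.cls_token.expand(B, -1, -1)  # (B, 1, embed_dim)
        x = torch.cat([cls, x], dim=1)          # prepend CLS token
        return x + self.pos_embed               # add positional information

# 224/16 = 14 patches per side -> 196 patches, +1 CLS token = 197 positions
x = torch.randn(2, 3, 224, 224)
out = PatchEmbedding()(x)
print(out.shape)  # torch.Size([2, 197, 768])
```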

The second part which goes through attention can be found here -
Attention in Vision Transformer (Part Two) -    • ATTENTION | An Image is Worth 16x16 W...  
The third part which builds entire transformer and shows how to visualize attention maps and positional embeddings can be found below -
Implementing Vision Transformer (Part Three) -    • Image Classification Using Vision Tra...  

Timestamps :
00:00 Intro
00:56 Need for Patch Embedding in Vision Transformer
01:30 Converting Image into Sequence of Patches
01:59 Patch Embedding Projection
02:45 Positional Information for Patches
03:40 CLS Token
04:10 Patch Embedding Responsibilities
04:40 Patch Embedding Module Implementation
08:02 Outro


Paper Link - https://tinyurl.com/exai-vit-paper
Implementation will be pushed here after all three videos are out - https://tinyurl.com/exai-vit-code
Subscribe - https://tinyurl.com/exai-channel-link

Background Track - Fruits of Life by Jimena Contreras
Email - [email protected]
