ConvNeXt: A ConvNet for the 2020s – Paper Explained (with animations)

Can a ConvNet outperform a Vision Transformer? What kind of modifications do we have to apply to a ConvNet to make it as powerful as a Transformer? Spoiler: it’s not attention.
► SPONSOR: Weights & Biases 👉 https://wandb.me/ai-coffee-break

The official ConvNeXt repo has a W&B integration! Also, W&B built the CIFAR10 training colab linked there: 🥳   / 1486325233711828996  

❓ Check out our daily #MachineLearning Quiz Questions:    / aicoffeebreak  
➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring....

Explained Paper 📜: Liu, Zhuang, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie. “A ConvNet for the 2020s.” arXiv preprint arXiv:2201.03545 (2022). https://arxiv.org/abs/2201.03545
🔗 Tweet by Lucas Beyer (ViT author):   / 1481054929573888005  
🔗 Depthwise convolutions image and explanation: https://eli.thegreenplace.net/2018/de...

Referenced videos:
📺 An image is worth 16x16 words:    • An image is worth 16x16 words: ViT | ...  
📺 Swin Transformer:    • Swin Transformer paper animated and e...  
📺 This is how Transformers can process both image and text:    • Transformers can do both images and t...  
📺 ViLBERT explained:    • Transformer combining Vision and Lang...  
📺 DeiT explained:    • Data-efficient Image Transformers EXP...  
📺 Transformers sequence length:    • Do Transformers process sequences of ...  

Referenced papers:
📜 “Image Transformer” Paper: Parmar, Niki, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Noam Shazeer, Alexander Ku, and Dustin Tran. “Image transformer.” In International Conference on Machine Learning, pp. 4055-4064. PMLR, 2018. https://arxiv.org/abs/1802.05751
📜 “ViLBERT” paper: Lu, Jiasen, Dhruv Batra, Devi Parikh, and Stefan Lee. “ViLBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks.” arXiv preprint arXiv:1908.02265 (2019). https://arxiv.org/abs/1908.02265

Outline:
00:00 A ConvNet for the 2020s
01:58 Weights & Biases (Sponsor)
03:10 Why bother?
04:40 The perks of ConvNets (CNNs)
06:51 Pros and cons of Transformers
09:54 From ConvNets to ConvNeXts
15:54 Lessons?

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
donor, Dres. Trost GbR, banana.dev

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon:   / aicoffeebreak  
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🔗 Links:
AICoffeeBreakQuiz:    / aicoffeebreak  
Twitter:   / aicoffeebreak  
Reddit:   / aicoffeebreak  
YouTube:    / aicoffeebreak  

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
