Vision Transformer and its Applications

The vision transformer is a recent breakthrough in computer vision. While transformer-based models have dominated natural language processing since 2017, CNN-based models still deliver state-of-the-art performance on vision problems. Last year, a group of researchers from Google figured out how to make a transformer work on image recognition and called it the "vision transformer". Follow-up work by the community has demonstrated superior performance of vision transformers not only in recognition but also in downstream tasks such as detection, segmentation, multi-modal learning, and scene text recognition, to mention a few.
In this talk, Rowel Atienza will take a deeper look at the model architecture of vision transformers. Most importantly, Rowel will focus on the concept of self-attention and its role in vision. Then, he will present different model implementations that use the vision transformer as the main backbone.
Since self-attention can be applied beyond transformers, Rowel Atienza will also discuss a promising direction in building general-purpose model architectures, in particular networks that can process a variety of data formats such as text, audio, image, and video.
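
As a rough illustration of the self-attention idea the talk centers on, here is a minimal NumPy sketch: an image is cut into patches, each patch becomes a token, and every token attends to every other token. This is not the speaker's code or the original vision transformer implementation. The function names, sizes, and random weights are assumptions made for the example, and it leaves out positional embeddings, the class token, multiple heads, layer normalization, and the MLP block.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must divide evenly into patches"
    x = image.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)           # (rows, cols, patch, patch, C)
    return x.reshape(-1, patch * patch * c)  # (num_patches, patch*patch*C)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights

# Hypothetical sizes: a 32x32 RGB image, 8x8 patches, 64-dimensional tokens.
rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))
tokens = patchify(image, patch=8)                  # 16 patches of 192 values
dim = 64
w_embed = rng.standard_normal((tokens.shape[1], dim)) * 0.02
embedded = tokens @ w_embed                        # linear patch embedding
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) * 0.1 for _ in range(3))
out, weights = self_attention(embedded, w_q, w_k, w_v)
```

Each row of `weights` sums to 1 and says how much one patch draws on every other patch, which is what lets a single layer relate distant regions of the image.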

→ To watch more videos like this, visit https://aiplus.training ←
Do you like this video? Share your thoughts in the comments below.
You can also visit our website and choose the nearest ODSC event to attend and experience all our trainings and workshops:
https://odsc.com/california/
https://odsc.com/apac/
Sign up for the newsletter to stay up to date with the latest trends in data science: https://opendatascience.com/newsletter/
Follow Us Online!
• Facebook: /opendatasci
• Instagram: /odsc
• Blog: https://opendatascience.com/
• LinkedIn: /open-data-science
• Twitter: /odsc
