DINO - DETR with Improved DeNoising AnchorBoxes for End-to-End Object Detection

This video covers DINO, the first DETR-like, transformer-based model to reach state-of-the-art accuracy.
This video is part of a broader series, Modern Object Detection: from YOLO to Transformer.
The model builds on the concepts introduced in DETR, Deformable DETR, DAB DETR and DN DETR, improving on them and remixing them to achieve superior quality under the same conditions (training time, parameter count, pretraining data size). One of the model variants also uses a huge backbone, Swin-L, and pretraining on the Objects365 dataset to achieve SOTA accuracy on the COCO dataset.
Important links:
- Original paper: https://arxiv.org/pdf/2203.03605.pdf
- DINO source code: https://github.com/IDEA-Research/DINO
- My video about DETR, the first model in the series: Object Detection with Transformers (D...
- My video about Deformable DETR: Deformable DETR
- My video about DAB DETR: DAB Detr (dynamic anchor boxes)
- My video about DN DETR: DN Detr (Denoising Detr for object de...

00:00 - Intro
02:30 - Previous DETR models overview
20:54 - Contrastive Denoising Loss
24:32 - Mixed Query Selection
26:59 - Look Forward Twice
30:20 - Objects365 Dataset
32:54 - Results
37:22 - Next Up
