Panoptic Lifting (CVPR 2023) with Yawar Siddiqui on Talking papers


All links are available in the blog post: https://www.itzikbs.com/panoptic-lifting

Welcome to another exciting episode of the Talking Papers Podcast! In this installment, I had the pleasure of hosting Yawar Siddiqui to discuss his groundbreaking paper titled "Panoptic Lifting for 3D Scene Understanding with Neural Fields," which has taken the CVPR 2023 conference by storm.

In this paper, Yawar Siddiqui and his co-authors introduce a novel approach to 3D scene understanding that lifts machine-generated 2D panoptic segmentation masks into a neural field trained on images of real-world scenes. The semantic labels are handled by a straightforward Multi-Layer Perceptron (MLP) head, but the instance indices are trickier: the 2D network assigns them arbitrarily, so the same object can receive a different index in every frame. Yawar resolves this matching problem by solving a linear assignment with the Hungarian algorithm, combined with a set of custom losses that keep the method robust to noisy labels.
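To make the matching step concrete, here is a minimal sketch of per-frame linear assignment between instance masks rendered from the model and the machine-generated 2D instance masks. This is not the authors' implementation; the IoU-based cost and the function name are illustrative assumptions.

```python
# Hedged sketch: match machine-generated 2D instance IDs to the model's
# surrogate instance IDs in one frame via linear assignment (Hungarian algorithm).
# An IoU-based cost is assumed for illustration only.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_instances(pred_masks, seg_masks):
    """pred_masks: (P, H, W) boolean masks rendered from the model's surrogate IDs.
    seg_masks:  (G, H, W) boolean masks from the pre-trained 2D panoptic network.
    Returns a list of (pred_id, seg_id) pairs maximising total IoU."""
    P, G = len(pred_masks), len(seg_masks)
    cost = np.zeros((P, G))
    for i in range(P):
        for j in range(G):
            inter = np.logical_and(pred_masks[i], seg_masks[j]).sum()
            union = np.logical_or(pred_masks[i], seg_masks[j]).sum()
            iou = inter / union if union > 0 else 0.0
            cost[i, j] = -iou  # negate: linear_sum_assignment minimises cost
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows, cols))
```

Because the assignment is recomputed per frame against the model's current predictions, the 3D surrogate IDs stay fixed even though the 2D network labels instances inconsistently from view to view.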

Yawar Siddiqui's remarkable research journey has captivated many, myself included, and it's truly awe-inspiring to witness the immense growth he has achieved over the years. Our conversation was a delightful exploration of his work, and I'm eagerly anticipating the exciting projects he has in store for us in the future.

Join us for this riveting episode as we delve into the world of "Panoptic Lifting for 3D Scene Understanding with Neural Fields" and gain invaluable insights from Yawar Siddiqui himself. Don't miss out on the opportunity to be part of this groundbreaking discussion - hit that play button now!

AUTHORS
Yawar Siddiqui, Lorenzo Porzi, Samuel Rota Bulò, Norman Müller, Matthias Nießner, Angela Dai, Peter Kontschieder

ABSTRACT
We propose Panoptic Lifting, a novel approach for learning panoptic 3D volumetric representations from images of in-the-wild scenes. Once trained, our model can render color images together with 3D-consistent panoptic segmentation from novel viewpoints. Unlike existing approaches which use 3D input directly or indirectly, our method requires only machine-generated 2D panoptic segmentation masks inferred from a pre-trained network. Our core contribution is a panoptic lifting scheme based on a neural field representation that generates a unified and multi-view consistent, 3D panoptic representation of the scene. To account for inconsistencies of 2D instance identifiers across views, we solve a linear assignment with a cost based on the model's current predictions and the machine-generated segmentation masks, thus enabling us to lift 2D instances to 3D in a consistent way. We further propose and ablate contributions that make our method more robust to noisy, machine-generated labels, including test-time augmentations for confidence estimates, segment consistency loss, bounded segmentation fields, and gradient stopping. Experimental results validate our approach on the challenging Hypersim, Replica, and ScanNet datasets, improving by 8.4, 13.8, and 10.6% in scene-level PQ over state of the art.
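To give a flavour of the "lifting" itself, below is a minimal sketch of how per-sample semantic logits from a field MLP can be accumulated into a per-pixel prediction with the same volume-rendering weights used for colour, in the spirit of Semantic NeRF (listed under related papers). The shapes, names, and the omission of the paper's robustness terms (confidence weighting, segment consistency loss, bounded fields, gradient stopping) are simplifying assumptions, not the authors' code.

```python
# Hedged sketch: volume-render per-sample semantic logits into a per-pixel
# prediction, reusing NeRF-style alpha compositing weights.
import torch

def render_semantics(sigma, class_logits, deltas):
    """sigma:        (R, S)    densities along S samples of R rays
    class_logits: (R, S, C) per-sample semantic logits from the semantic MLP
    deltas:       (R, S)    distances between consecutive samples
    Returns (R, C) rendered semantic logits per ray/pixel."""
    alpha = 1.0 - torch.exp(-sigma * deltas)                       # (R, S)
    trans = torch.cumprod(torch.cat(
        [torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=1), dim=1)[:, :-1]
    weights = alpha * trans                                         # (R, S)
    return (weights.unsqueeze(-1) * class_logits).sum(dim=1)        # (R, C)
```

Supervising these rendered logits (and the analogous instance outputs) with the machine-generated 2D masks is what turns view-inconsistent 2D predictions into a single, multi-view-consistent 3D panoptic representation.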

RELATED PAPERS
📚NeRF
📚Mask2Former
📚Semantic NeRF

LINKS AND RESOURCES

📚 Paper: https://bit.ly/3JQTqsx
💻Project page: https://nihalsid.github.io/panoptic-l...
💻Code: https://github.com/nihalsid/panoptic-...

SPONSOR
This episode was sponsored by YOOM. YOOM is an Israeli startup dedicated to volumetric video creation, voted the best startup to work for in 2022 by Dun's 100.
Join their team working on geometric deep learning research, implicit representations of 3D humans, NeRFs, and 3D/4D generative models.

Visit https://www.yoom.com/

For job opportunities with YOOM visit https://www.yoom.com/careers/



CONTACT

If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: [email protected]

🎧Subscribe on your favourite podcasting app: https://talking.papers.podcast.itzikb...

📧Subscribe to our mailing list: http://eepurl.com/hRznqb

🐦Follow us on Twitter: @talking_papers

🎥YouTube Channel: https://bit.ly/3eQOgwP

TIME STAMPS
---------------------
00:00 Panoptic Lifting
01:07 Authors
01:40 Abstract
02:35 Introduction
09:27 Contribution
10:30 Related work
12:21 Approach
30:39 Results
38:18 Conclusions and future work
41:40 What did reviewer 2 say?
44:05 Sponsor


#talkingpapers #CVPR2023 #PanopticLifting #NeRF #TensoRF #AI #Segmentation #DeepLearning #MachineLearning #research #artificialintelligence #podcasts
