Despoina Paschalidou: Compositional Representations for Understanding and Generating 3D Environments

This talk was held on March 24, 2022 as a part of the MLFL series, hosted by the Center for Data Science, UMass Amherst.

Abstract: Within the first year of our life, we develop a common-sense understanding of the physical behavior of the world, which relies heavily on our ability to properly reason about the arrangement of objects in a scene. While this seems to be a fairly easy task for the human brain, computer vision algorithms struggle to form such high-level reasoning. Therefore, the research community shifted their attention to the development of primitive-based methods that seek to represent objects as semantically consistent part arrangements. However, due to the simplicity of existing primitive representations, these methods fail to accurately reconstruct 3D shapes using a small number of primitives/parts.

In the first part of my talk, I will address the trade-off between reconstruction quality and number of parts and present Neural Parts, a novel 3D primitive representation that defines primitives using an Invertible Neural Network (INN) which implements homeomorphic mappings between a sphere and the target object. Since a homeomorphism does not impose any constraints on the primitive shape, our model effectively decouples geometric accuracy from parsimony and as a result captures complex geometries with an order of magnitude fewer primitives. In the second part of my talk, we will look into the problem of inferring and subsequently also generating semantically meaningful object arrangements to populate 3D scenes conditioned on the room shape. In particular, I will present ATISS, a novel autoregressive transformer architecture for creating diverse and plausible synthetic indoor environments as unordered sets of objects. Our unordered set formulation allows us to use the same trained model for a variety of interactive applications such as general scene completion, partial room rearrangement with any objects specified by the user, as well as object suggestions for any partial room. This is an important step towards fully automatic content creation.


Bio: Despoina Paschalidou is a postdoc at Stanford University working with Prof. Leo Guibas at the Geometric Computation Group. Prior to this, she did her PhD at the Max Planck Institute for Intelligent Systems in Tübingen and the Computer Vision Lab at ETH Zurich, under the guidance of Prof. Andreas Geiger and Prof. Luc Van Gool. She received her Diploma in Electrical and Computer Engineering from the Aristotle University of Thessaloniki in 2015. Her research interests revolve around semantic and interpretable representations of 3D objects and scenes. She spent one year working with Prof. Sanja Fidler at NVIDIA Research on developing interactive tools for content creation. Moreover, she spent six months at FAIR working with Prof. Andrea Vedaldi and David Novotny on unsupervised 3D reconstruction from video data. https://paschalidoud.github.io/

About Machine Learning and Friends Lunch: MLFL is a lively and interactive forum held weekly where friends of the UMass Amherst machine learning community can sit down, have lunch, and give or hear a 50-minute presentation on recent machine learning research. This semester of the UMass MLFL series has been graciously sponsored by our friends at Oracle Labs.

Please follow this link to learn more about past and upcoming talks: http://umass-mlfl.github.io
