TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data

Video description

[ECCV 2024] Images and structured tables are essential parts of real-world databases. Although tabular-image representation learning promises new insights, it remains challenging: tabular data is typically heterogeneous and incomplete, and it differs substantially from images as a modality. Earlier work has mainly focused on simple modality fusion strategies in complete-data scenarios, without considering missing data, and is therefore limited in practice. In this paper, we propose TIP, a novel tabular-image pre-training framework for learning multimodal representations that are robust to incomplete tabular data. Specifically, TIP explores a novel self-supervised learning (SSL) strategy that combines a masked tabular reconstruction task, which tackles data missingness, with image-tabular matching and contrastive learning objectives, which capture multimodal information. Moreover, TIP introduces a versatile tabular encoder tailored to incomplete, heterogeneous tabular data and a multimodal interaction module for inter-modality representation learning. Experiments are performed on downstream multimodal classification tasks using both natural and medical image datasets. The results show that TIP outperforms state-of-the-art supervised/SSL image/multimodal algorithms in both complete and incomplete data scenarios. Our code is available at https://github.com/siyi-wind/TIP.
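
To make two of the pre-training objectives named above more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (see the linked repository for that). It assumes numeric-only table rows, uses `None` as a stand-in for a mask token, and uses a generic symmetric InfoNCE loss for the contrastive term. All function names and the `temperature` value are hypothetical, and the image-tabular matching objective is left out.

```python
import math
import random

def mask_row(row, mask_ratio, rng):
    """Hide each cell with probability mask_ratio; None marks a hidden cell."""
    mask = [rng.random() < mask_ratio for _ in row]
    masked = [None if m else v for v, m in zip(row, mask)]
    return masked, mask

def reconstruction_loss(row, predicted, mask):
    """Mean squared error over the hidden cells only."""
    errors = [(p - v) ** 2 for v, p, m in zip(row, predicted, mask) if m]
    return sum(errors) / len(errors) if errors else 0.0

def normalize(vec):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def log_softmax_at(scores, target):
    """Log-probability of index `target` under a softmax over `scores`."""
    top = max(scores)
    log_sum = top + math.log(sum(math.exp(s - top) for s in scores))
    return scores[target] - log_sum

def contrastive_loss(img_embs, tab_embs, temperature=0.1):
    """Symmetric InfoNCE: the i-th image and i-th table row form a positive pair."""
    imgs = [normalize(e) for e in img_embs]
    tabs = [normalize(e) for e in tab_embs]
    n = len(imgs)
    # Cosine similarity between every image and every table row, scaled by temperature.
    sim = [[sum(a * b for a, b in zip(imgs[i], tabs[j])) / temperature
            for j in range(n)] for i in range(n)]
    img_to_tab = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    tab_to_img = -sum(log_softmax_at([sim[i][j] for i in range(n)], j)
                      for j in range(n)) / n
    return (img_to_tab + tab_to_img) / 2
```

In this sketch the reconstruction loss is computed only on hidden cells, so a model would be scored on recovering values it could not see, and the contrastive loss is small when each image embedding is closest to the embedding of its own table row.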
