PrefMMT: Modeling Human Preferences in Preference-based RL with Multimodal Transformers


This is a supplementary video for the paper titled "PrefMMT: Modeling Human Preferences in Preference-based Reinforcement Learning with Multimodal Transformers", by Dezhong Zhao*, Ruiqi Wang*, Dayoon Suh, Taehyeon Kim, Ziqin Yuan, Byung-Cheol Min, and Guohua Chen (*: equal contribution).

Paper Website: https://sites.google.com/view/prefmmt

Abstract: Preference-based reinforcement learning (PbRL) shows promise in aligning robot behaviors with human preferences, but its success depends heavily on accurate modeling of human preferences through reward models. Most methods adopt Markovian assumptions for preference modeling (PM), which overlook the temporal dependencies within robot behavior trajectories that impact human evaluations. While recent works have used sequence modeling to mitigate this by learning sequential non-Markovian rewards, they ignore the multimodal nature of robot trajectories, which consist of elements from two distinct modalities: state and action. As a result, they often struggle to capture the complex interplay between these modalities that significantly shapes human preferences. In this paper, we propose a multimodal sequence modeling approach for PM by disentangling state and action modalities. We introduce a multimodal transformer network, named PrefMMT, which hierarchically leverages intra-modal temporal dependencies and inter-modal state-action interactions to capture complex preference patterns. We demonstrate that PrefMMT consistently outperforms state-of-the-art PM baselines on locomotion tasks from the D4RL benchmark and manipulation tasks from the Meta-World benchmark.
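
To make the idea described in the abstract concrete, below is a minimal, illustrative sketch (not the authors' implementation; see the paper and website for details) of a preference model that disentangles state and action modalities, encodes intra-modal temporal dependencies with per-modality transformer encoders, fuses the modalities with cross-attention to capture state-action interactions, and trains the resulting non-Markovian reward model with a Bradley-Terry preference loss. All module names, dimensions, and hyperparameters here are assumptions for demonstration only.

```python
# Minimal sketch of a multimodal (state/action) transformer preference model.
# Assumptions: architecture details, names, and sizes are illustrative only.
import torch
import torch.nn as nn


class MultimodalPreferenceModel(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, d_model: int = 64):
        super().__init__()
        self.state_embed = nn.Linear(state_dim, d_model)
        self.action_embed = nn.Linear(action_dim, d_model)
        # Intra-modal encoders: temporal self-attention within each modality.
        make_layer = lambda: nn.TransformerEncoderLayer(
            d_model, nhead=4, dim_feedforward=128, batch_first=True)
        self.state_encoder = nn.TransformerEncoder(make_layer(), num_layers=2)
        self.action_encoder = nn.TransformerEncoder(make_layer(), num_layers=2)
        # Inter-modal fusion: action tokens attend to state tokens.
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.reward_head = nn.Linear(d_model, 1)

    def forward(self, states: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
        # states: (B, T, state_dim), actions: (B, T, action_dim)
        s = self.state_encoder(self.state_embed(states))      # intra-modal (state)
        a = self.action_encoder(self.action_embed(actions))   # intra-modal (action)
        fused, _ = self.cross_attn(query=a, key=s, value=s)   # inter-modal interplay
        per_step_reward = self.reward_head(fused).squeeze(-1)  # (B, T)
        return per_step_reward.sum(dim=-1)                     # trajectory score


def preference_loss(model, seg0, seg1, label):
    """Bradley-Terry loss: label = 1 means segment 1 is preferred over segment 0."""
    scores = torch.stack([model(*seg0), model(*seg1)], dim=-1)  # (B, 2)
    return nn.functional.cross_entropy(scores, label)


if __name__ == "__main__":
    model = MultimodalPreferenceModel(state_dim=17, action_dim=6)
    s0, a0 = torch.randn(8, 50, 17), torch.randn(8, 50, 6)
    s1, a1 = torch.randn(8, 50, 17), torch.randn(8, 50, 6)
    prefs = torch.randint(0, 2, (8,))  # synthetic preference labels
    loss = preference_loss(model, (s0, a0), (s1, a1), prefs)
    loss.backward()
    print(float(loss))
```

In this sketch the cross-attention step stands in for the inter-modal state-action interaction stage named in the abstract; the actual PrefMMT hierarchy may differ.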
