SANE2023 | Wenwu Wang - Audio-Text Learning for Automated Audio Captioning and Generation


Wenwu Wang, Professor in Signal Processing and Machine Learning and Co-Director of the Machine Audition Lab within the Centre for Vision, Speech and Signal Processing at the University of Surrey, UK, presents his work on audio-text learning for automated audio captioning and generation at the SANE 2023 workshop, held at New York University, New York, on October 26, 2023.
More info on the SANE workshop series: http://www.saneworkshop.org/

Abstract: Cross-modal generation of audio and text has emerged as an important research area in audio signal processing and natural language processing. Audio-to-text generation, also known as automated audio captioning, aims to produce a meaningful language description of the content of an audio clip. It can be used to assist hearing-impaired people in understanding environmental sounds, to facilitate retrieval of multimedia content, and to analyze sounds for security surveillance. Text-to-audio generation aims to produce an audio clip from a text prompt, i.e., a language description of the audio content to be generated. It can serve as a sound synthesis tool for film making, game design, virtual reality/metaverse, and digital media, and as a digital assistant for text understanding by the visually impaired. To achieve cross-modal audio-text generation, it is essential to comprehend the audio events and scenes within an audio clip and to interpret the textual information presented in natural language; learning the mapping and alignment between these two streams of information is also crucial. Exciting developments have recently emerged in the field of automated audio-text cross-modal generation. In this talk, we give an introduction to this field, including the problem description, potential applications, datasets, open challenges, recent technical progress, and possible future research directions.
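
To give a concrete flavour of the "mapping and alignment" idea mentioned in the abstract, below is a minimal sketch of contrastive audio-text alignment (in the spirit of CLAP-style training). The toy encoders, dimensions, and names are illustrative assumptions only, not the models presented in the talk; real systems use pretrained audio and text encoders.

```python
# Minimal sketch of contrastive audio-text alignment (CLAP-style).
# All modules here are toy stand-ins, not the actual models from the talk.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyAudioEncoder(nn.Module):
    """Maps log-mel spectrograms (B, T, n_mels) to clip embeddings (B, dim)."""
    def __init__(self, n_mels=64, dim=256):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(n_mels, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, mel):
        return self.proj(mel).mean(dim=1)  # average-pool over time frames

class ToyTextEncoder(nn.Module):
    """Maps caption token ids (B, L) to sentence embeddings (B, dim)."""
    def __init__(self, vocab=10000, dim=256):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)

    def forward(self, tokens):
        return self.emb(tokens).mean(dim=1)  # average-pool over tokens

def contrastive_loss(audio_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss that pulls paired audio/caption embeddings together."""
    a = F.normalize(audio_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = a @ t.T / temperature          # (B, B) cross-modal similarity matrix
    targets = torch.arange(a.size(0))       # matched pairs lie on the diagonal
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2

if __name__ == "__main__":
    audio_enc, text_enc = ToyAudioEncoder(), ToyTextEncoder()
    mel = torch.randn(8, 100, 64)              # 8 clips, 100 frames, 64 mel bins
    tokens = torch.randint(0, 10000, (8, 20))  # 8 captions, 20 tokens each
    loss = contrastive_loss(audio_enc(mel), text_enc(tokens))
    print(f"alignment loss: {loss.item():.3f}")
```

Once such a shared embedding space is learned, it can support caption generation (conditioning a language decoder on audio embeddings) or text-to-audio generation (conditioning an audio generator on text embeddings), which are the two directions discussed in the talk.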
