SANE2023 | Wenwu Wang - Audio-Text Learning for Automated Audio Captioning and Generation


Wenwu Wang, Professor in Signal Processing and Machine Learning and Co-Director of the Machine Audition Lab within the Centre for Vision, Speech and Signal Processing at the University of Surrey, UK, presents his work on audio-text learning for automated audio captioning and generation at the SANE 2023 workshop, held at New York University, New York, on October 26, 2023.
More info on the SANE workshop series: http://www.saneworkshop.org/

Abstract: Cross-modal generation of audio and text has emerged as an important research area in audio signal processing and natural language processing. Audio-to-text generation, also known as automated audio captioning, aims to produce a meaningful language description of the content of an audio clip. It can be used to assist hearing-impaired people in understanding environmental sounds, to facilitate retrieval of multimedia content, and to analyze sounds for security surveillance. Text-to-audio generation aims to produce an audio clip from a text prompt, i.e., a language description of the audio content to be generated. It can serve as a sound synthesis tool for film making, game design, virtual reality/metaverse, and digital media, and as a digital assistant for text understanding by the visually impaired. To achieve cross-modal audio-text generation, it is essential to comprehend the audio events and scenes within an audio clip and to interpret the textual information presented in natural language; learning the mapping and alignment between these two streams of information is also crucial. Exciting developments have recently emerged in the field of automated audio-text cross-modal generation. In this talk, we give an introduction to this field, including the problem description, potential applications, datasets, open challenges, recent technical progress, and possible future research directions.
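
To give a concrete flavour of the "mapping and alignment" idea mentioned in the abstract, below is a minimal sketch of contrastive audio-text alignment (in the spirit of CLAP-style training). The toy encoders, dimensions, and names are illustrative assumptions only, not the models presented in the talk; real systems use pretrained audio and text encoders.

```python
# Minimal sketch of contrastive audio-text alignment (CLAP-style).
# All modules here are toy stand-ins, not the actual models from the talk.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyAudioEncoder(nn.Module):
    """Maps log-mel spectrograms (B, T, n_mels) to clip embeddings (B, dim)."""
    def __init__(self, n_mels=64, dim=256):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(n_mels, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, mel):
        return self.proj(mel).mean(dim=1)  # average-pool over time frames

class ToyTextEncoder(nn.Module):
    """Maps caption token ids (B, L) to sentence embeddings (B, dim)."""
    def __init__(self, vocab=10000, dim=256):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)

    def forward(self, tokens):
        return self.emb(tokens).mean(dim=1)  # average-pool over tokens

def contrastive_loss(audio_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss that pulls paired audio/caption embeddings together."""
    a = F.normalize(audio_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = a @ t.T / temperature          # (B, B) cross-modal similarity matrix
    targets = torch.arange(a.size(0))       # matched pairs lie on the diagonal
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2

if __name__ == "__main__":
    audio_enc, text_enc = ToyAudioEncoder(), ToyTextEncoder()
    mel = torch.randn(8, 100, 64)              # 8 clips, 100 frames, 64 mel bins
    tokens = torch.randint(0, 10000, (8, 20))  # 8 captions, 20 tokens each
    loss = contrastive_loss(audio_enc(mel), text_enc(tokens))
    print(f"alignment loss: {loss.item():.3f}")
```

Once such a shared embedding space is learned, it can support caption generation (conditioning a language decoder on audio embeddings) or text-to-audio generation (conditioning an audio generator on text embeddings), which are the two directions discussed in the talk.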
