Can Whisper be used for real-time streaming ASR?

Video description

Try Voice Writer - speak your thoughts and let AI handle the grammar: https://voicewriter.io

Whisper is a robust automatic speech recognition (ASR) model by OpenAI, but can it handle real-time streaming ASR, where the latency requirement is a few seconds? This turns out to be not too difficult with the open-source whisper-streaming project, which turns Whisper into a streaming ASR system. It works by feeding progressively longer audio buffers into the Whisper model, using the LocalAgreement algorithm to confirm output as soon as two consecutive iterations agree on it, and then scrolling the buffer forward to the start of the next sentence.
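
To make the confirmation step concrete, here is a minimal Python sketch of the two-iteration agreement idea. It is not the whisper-streaming implementation: the class name LocalAgreement2, the helper common_prefix, and the word lists are hypothetical, and the sketch assumes every hypothesis transcribes a buffer with the same start point. Buffer scrolling and prompting with previous context are left out.

```python
from typing import List


def common_prefix(a: List[str], b: List[str]) -> List[str]:
    """Return the longest common prefix of two word lists."""
    prefix: List[str] = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


class LocalAgreement2:
    """Confirm words once two consecutive hypotheses agree on them.

    Hypothetical illustration only; assumes all hypotheses cover a
    buffer that starts at the same point in the audio.
    """

    def __init__(self) -> None:
        self.confirmed: List[str] = []  # words already emitted
        self.previous: List[str] = []   # hypothesis from the last iteration

    def update(self, hypothesis: List[str]) -> List[str]:
        """Take the latest hypothesis and return newly confirmed words."""
        agreed = common_prefix(self.previous, hypothesis)
        # Emit only the part of the agreed prefix not confirmed before.
        newly_confirmed = agreed[len(self.confirmed):]
        self.confirmed.extend(newly_confirmed)
        self.previous = hypothesis
        return newly_confirmed


# Simulated model outputs for a growing audio buffer (made-up data).
hypotheses = [
    ["can", "whisper"],
    ["can", "whisper", "be", "used"],
    ["can", "whisper", "be", "use", "for"],
    ["can", "whisper", "be", "used", "for", "streaming"],
]

agent = LocalAgreement2()
for hyp in hypotheses:
    print(agent.update(hyp))
```

In this made-up run, the unstable word ("used" versus "use") is never confirmed because no two consecutive hypotheses agree on it, while "can whisper be" is confirmed as soon as it appears twice in a row. This shows the trade-off in the approach: output is delayed by one iteration in exchange for not emitting words the model may still revise.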

0:00 - Introduction
0:35 - Batch vs Streaming ASR
1:55 - Why is this difficult?
2:58 - Whisper-streaming demo
3:38 - Processing consecutive audio buffers
4:36 - Confirming tokens with LocalAgreement
6:05 - Prompting previous context
7:01 - Limitations vs other streaming ASR models

References:

https://github.com/ufal/whisper_strea...

Macháček, Dominik, Raj Dabre, and Ondřej Bojar. "Turning Whisper into Real-Time Transcription System." IJCNLP-AACL 2023.

Chen, Xie, et al. "Developing real-time streaming transformer transducer for speech recognition on large-scale dataset." ICASSP 2021.