[FPGA 2022] An FPGA-based RNN-T Inference Accelerator with PIM-HBM

Video description

Shinhaeng Kang, Samsung Electronics
Sukhan Lee, Samsung Electronics
Byeongho Kim, Seoul National University
Hweesoo Kim, Samsung Electronics
Kyomin Sohn, Samsung Electronics
Nam Sung Kim, Samsung Electronics
Eojin Lee, Inha University

In this paper, we implement the world's first RNN-T inference accelerator on an FPGA with PIM-HBM, which can multiply the internal bandwidth of the memory. The accelerator offloads the matrix-vector multiplication (GEMV) operations of the LSTM layers in RNN-T to PIM-HBM, and PIM-HBM significantly reduces the execution time of GEMV by exploiting the internal bandwidth of HBM. Issuing memory commands in a pre-defined order is one of the most important constraints in exploiting PIM-HBM. To guarantee this order, we implement a direct memory access (DMA) module and change the configuration of the on-chip memory controller, taking advantage of the flexibility and reconfigurability of the FPGA. In addition, we design the other hardware modules needed for acceleration, namely the non-linear functions (sigmoid and hyperbolic tangent), the element-wise operations, and a ReLU module, so that these compute-bound RNN-T operations run on the FPGA. To this end, we prepare FP16-quantized weights and MLPerf input datasets, and modify the PCIe device driver and the C++-based control code. In our evaluation, the accelerator with PIM-HBM reduces the execution time of RNN-T by 2.5× on average with an 11.09% reduction in LUT size, and improves energy efficiency by up to 2.6× compared to the baseline.
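
The split between GEMV and the remaining LSTM operations described above can be illustrated with a minimal sketch of one LSTM time step. This is not the authors' implementation: it assumes the textbook LSTM equations, a gate order of input, forget, cell candidate, output, and plain Python floats instead of FP16. The names gemv, sigmoid, and lstm_step are hypothetical. The comments mark which phase corresponds to the work the abstract assigns to PIM-HBM and which to the FPGA modules.

```python
import math


def gemv(matrix, vector):
    """Matrix-vector product. Each weight is read once and used once,
    which is why this step is limited by memory bandwidth."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    """Numerically safe logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def lstm_step(W, U, b, x, h_prev, c_prev):
    """One LSTM time step for hidden size n (hypothetical helper).

    W: 4n x input_dim weights, U: 4n x n recurrent weights, b: 4n biases.
    Gate order [input, forget, candidate, output] is an assumption.
    """
    n = len(h_prev)

    # Phase 1: GEMV, the part the abstract describes as offloaded to PIM-HBM.
    wx = gemv(W, x)
    uh = gemv(U, h_prev)
    pre = [a + r + bias for a, r, bias in zip(wx, uh, b)]

    # Phase 2: non-linear functions (sigmoid and tanh), kept on the FPGA.
    i = [sigmoid(v) for v in pre[0:n]]
    f = [sigmoid(v) for v in pre[n:2 * n]]
    g = [math.tanh(v) for v in pre[2 * n:3 * n]]
    o = [sigmoid(v) for v in pre[3 * n:4 * n]]

    # Phase 3: element-wise operations, also kept on the FPGA.
    c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c
```

For a hidden size n and input size d, phase 1 touches 4n(n + d) weights per step, while phases 2 and 3 touch only a few vectors of length n. That asymmetry is the reason the GEMV phase benefits most from higher memory bandwidth.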

ACM Digital Library: https://dl.acm.org/doi/10.1145/349042...

Created with Midspace: https://midspace.app/
