ISCA 2024: TCP -- Tensor Contraction Processor for AI Workloads

FuriosaAI CTO Hanjoon Kim presents "TCP: A Tensor Contraction Processor for AI Workloads" at ISCA 2024, the annual International Symposium on Computer Architecture.

The Tensor Contraction Processor (TCP) is a novel chip architecture whose design makes it easier to program and optimize while also enabling greater data reuse and energy efficiency.

TCP is the underlying architecture of Furiosa's second-generation chip, RNGD, which is designed to accelerate inference for a wide range of models -- in particular, LLMs and multi-modal models. RNGD (pronounced "Renegade") has a 150W TDP and uses 48GB of HBM3 memory.
RNGD supports BF16 for running floating-point models directly, and it also offers INT8, INT4, and FP8 precision options for quantized models.

Learn more and sign up for updates: https://furiosa.ai/
Paper: http://furiosa.ai/download/FuriosaAI-...
Blog post: https://furiosa.ai/blog/tensor-contra...
