COSIC seminar "FPT: a Fixed-Point Accelerator for Torus..." (Michiel Van Beirendonck, KU Leuven)

Описание к видео COSIC seminar "FPT: a Fixed-Point Accelerator for Torus..." (Michiel Van Beirendonck, KU Leuven)

COSIC seminar – FPT: a Fixed-Point Accelerator for Torus Fully Homomorphic Encryption – Michiel Van Beirendonck (KU Leuven)

Fully Homomorphic Encryption is a technique that allows computation on encrypted data. It has the potential to change privacy considerations in the cloud, but computational and memory overheads are preventing its adoption. TFHE is a promising Torus-based FHE scheme that relies on bootstrapping, the noise-removal tool invoked after each encrypted logical/arithmetical operation.

We present FPT, a Fixed-Point FPGA accelerator for TFHE bootstrapping. FPT is the first hardware accelerator to exploit the inherent noise present in FHE calculations. Instead of double or single-precision floating-point arithmetic, it implements TFHE bootstrapping entirely with approximate fixed-point arithmetic. Using an in-depth analysis of noise propagation in bootstrapping FFT computations, FPT is able to use noise-trimmed fixed-point representations that are up to 50% smaller than prior implementations.

FPT is built as a streaming processor inspired by traditional streaming DSPs: it instantiates directly cascaded high-throughput computational stages, with minimal control logic and routing networks. We explore throughput-balanced compositions of streaming kernels with a user-configurable streaming width in order to construct a full bootstrapping pipeline. Our approach allows 100% utilization of arithmetic units and requires only a small bootstrapping key cache, enabling an entirely compute-bound bootstrapping throughput of 1 BS / 35us. This is in stark contrast to the classical CPU approach to FHE bootstrapping acceleration, which is typically constrained by memory and bandwidth.

FPT is implemented and evaluated as a bootstrapping FPGA kernel for an Alveo U280 datacenter accelerator card. FPT achieves two to three orders of magnitude higher bootstrapping throughput than existing CPU-based implementations, and 2.5x higher throughput compared to recent ASIC emulation experiments.

Joint work with: Jan-Pieter D’Anvers, Furkan Turan, Ingrid Verbauwhede

Комментарии

Информация по комментариям в разработке