Accelerating Transformers with Optimum Neuron, AWS Trainium and AWS Inferentia2

In this video, I show you how to accelerate Transformer training and inference with the Hugging Face Optimum Neuron library, a hardware acceleration library dedicated to AWS Trainium and AWS Inferentia 2, two custom AI chips designed by AWS.

First, by changing a single line of code, I show you how to train a Vision Transformer model on the food101 dataset (75K training images). On a trn1.32xlarge instance, the model trains in under a minute per epoch.
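The "single line of code" refers to swapping the standard `transformers` Trainer for its Optimum Neuron counterpart. Below is a minimal sketch of what such a fine-tuning script could look like; the checkpoint, hyperparameters, and preprocessing are illustrative assumptions, not the exact code from the video, and running it requires a Trainium (trn1) instance with the Neuron SDK installed.

```python
# Sketch: fine-tuning a Vision Transformer on food101 with Optimum Neuron.
# Assumptions: google/vit-base-patch16-224 as the base checkpoint and
# illustrative hyperparameters; adapt to your own setup.
from datasets import load_dataset
from transformers import (
    AutoImageProcessor,
    AutoModelForImageClassification,
    TrainingArguments,
)
# The one-line change: NeuronTrainer replaces transformers.Trainer
# and handles compilation/distribution across the Trainium NeuronCores.
from optimum.neuron import NeuronTrainer

model_id = "google/vit-base-patch16-224"
dataset = load_dataset("food101", split="train")
processor = AutoImageProcessor.from_pretrained(model_id)

def preprocess(batch):
    # Resize/normalize images into pixel_values tensors
    inputs = processor(batch["image"], return_tensors="pt")
    inputs["labels"] = batch["label"]
    return inputs

dataset = dataset.map(preprocess, batched=True, remove_columns=["image"])

model = AutoModelForImageClassification.from_pretrained(
    model_id,
    num_labels=101,
    ignore_mismatched_sizes=True,  # new 101-class head replaces the original one
)

args = TrainingArguments(
    output_dir="vit-food101",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=5e-5,
)

trainer = NeuronTrainer(model=model, args=args, train_dataset=dataset)
trainer.train()
```

Because `NeuronTrainer` keeps the `Trainer` API, the rest of an existing training script can stay untouched.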

Then, I show you how to export a DistilBERT model from the hub to Inferentia2. Running a benchmark on an inf2.xlarge instance, we get over 2,000 predictions per second with a P99 latency of 1 millisecond!
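Exporting a hub model for Inferentia2 can be done with the `NeuronModel` classes in Optimum Neuron, which compile the model with the Neuron SDK at load time. A minimal sketch (the checkpoint name and input shapes are assumptions; compilation requires an inf2 instance):

```python
# Sketch: exporting a DistilBERT checkpoint to AWS Inferentia2.
# export=True triggers compilation with the Neuron SDK; Inferentia2
# requires static input shapes, hence batch_size and sequence_length.
from optimum.neuron import NeuronModelForSequenceClassification
from transformers import AutoTokenizer

# Assumption: a sentiment-analysis DistilBERT checkpoint from the hub.
model_id = "distilbert-base-uncased-finetuned-sst-2-english"

model = NeuronModelForSequenceClassification.from_pretrained(
    model_id,
    export=True,
    batch_size=1,
    sequence_length=128,
)
model.save_pretrained("distilbert_neuron/")  # reusable compiled artifact

# Inference runs on the Inferentia2 NeuronCores with the familiar API.
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer(
    "This movie was great!",
    padding="max_length",
    max_length=128,
    return_tensors="pt",
)
outputs = model(**inputs)
print(outputs.logits)
```

Saving the compiled model avoids recompilation on subsequent loads, which is what makes low, stable P99 latencies practical in production.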

⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos ⭐️⭐️⭐️

Amazon EC2 Trn1: https://aws.amazon.com/ec2/instance-t...
Amazon EC2 Inf2: https://aws.amazon.com/ec2/instance-t...
Hugging Face Neuron AMI: https://aws.amazon.com/marketplace/pp...
Optimum Neuron documentation: https://huggingface.co/docs/optimum-n...
Optimum Neuron Github: https://github.com/huggingface/optimu...
Code: https://gitlab.com/juliensimon/huggin...
