Optimizing Model Deployments with Triton Model Analyzer


How do you identify the batch size and number of model instances that yield optimal inference performance? Triton Model Analyzer is an offline tool that evaluates hundreds of configurations to find those that meet the latency, throughput, and memory requirements of your application.
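As a sketch of a typical workflow, Model Analyzer's `profile` subcommand sweeps candidate configurations against a model in a Triton model repository. The repository path and model name below are placeholders; consult the linked repository for the full set of flags and their current names.

```shell
# Hypothetical invocation: profile a model named "resnet50" stored in a
# local Triton model repository, letting Model Analyzer search over
# batch sizes and instance counts automatically.
model-analyzer profile \
    --model-repository /path/to/model_repository \
    --profile-models resnet50
```

After profiling completes, Model Analyzer produces reports summarizing the measured latency, throughput, and memory usage of each configuration it tried, which you can use to pick a configuration that fits your application's constraints.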

Get started with model analyzer here: https://github.com/triton-inference-s...

#Triton #Inference #ModelAnalyzer #AI
