Apply Second-Order Pruning Algorithms for SOTA Model Compression


Second-order pruning methods enable higher sparsity while maintaining accuracy by removing the weights whose removal affects the loss function the least. The result is a sparse model with a much smaller file size, lower latency, and higher throughput. For example, using second-order pruning algorithms, a ResNet-50 image classification model can be pruned to 95% sparsity while retaining 99% of its baseline accuracy, shrinking the file from the original 90.8 MB to 9.3 MB.
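To build intuition for "weights that affect the loss the least": classic second-order pruning (Optimal Brain Surgeon, the family these algorithms build on) approximates the loss around the trained weights as a quadratic in the weight perturbation and scores each weight q by the saliency w_q^2 / (2 [H^-1]_qq), where H is the Hessian. The sketch below is a minimal toy illustration of that scoring and the compensating update of the remaining weights; the function name, toy Hessian, and values are illustrative assumptions, not code from the video or Neural Magic's libraries.

```python
import numpy as np

def obs_prune_one(w, H):
    """Toy Optimal Brain Surgeon step: remove the one weight whose
    deletion increases a local quadratic loss model the least.

    Assumes dL ~= 1/2 * dw^T H dw with H the (positive-definite) Hessian.
    Returns (pruned index, updated weights, predicted loss increase).
    """
    H_inv = np.linalg.inv(H)
    # Saliency of weight q: rho_q = w_q^2 / (2 * [H^-1]_qq)
    saliencies = w ** 2 / (2.0 * np.diag(H_inv))
    q = int(np.argmin(saliencies))
    # Optimal compensating update of the remaining weights:
    # dw = -w_q * H^-1 e_q / [H^-1]_qq  (this also zeroes w_q exactly)
    dw = -w[q] * H_inv[:, q] / H_inv[q, q]
    return q, w + dw, float(saliencies[q])

if __name__ == "__main__":
    w = np.array([0.1, -2.0, 0.9])          # toy trained weights
    H = np.diag([4.0, 1.0, 2.0])            # toy positive-definite Hessian
    q, w_new, loss_inc = obs_prune_one(w, H)
    print(q, w_new, loss_inc)  # prunes index 0: small weight, cheap to remove
```

Note that the smallest-magnitude weight is not always the one pruned: a small weight sitting in a high-curvature direction can cost more to remove than a larger weight in a flat direction, which is exactly what magnitude pruning misses and second-order methods capture.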

In this video, we walk through the research, production results, and intuition for how second-order pruning algorithms work. We run through how to apply second-order pruning algorithms for SOTA model compression to your current ML projects.

Speaker: Eldar Kurtić, Research Consultant, Neural Magic

If you have any questions, join us in the Neural Magic Slack community: https://join.slack.com/t/discuss-neur...
