Fix CUDA Out of Memory (OOM) in PyTorch! No GPU Upgrades

Today I ran into the most dreaded words in LLM training: CUDA out of memory. But don't worry: I've found three powerful fixes you can try before paying for more expensive hardware. Wish you CUDAn't run out of memory again.

00:16 Method 1: reduce the batch size
00:42 Gradient accumulation
01:04 Method 2: mixed precision training
01:28 FP32 vs FP16
02:55 Method 3: gradient checkpointing
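The three methods above can be combined in a single training loop. Here is a minimal sketch, assuming a toy model and made-up sizes in place of a real LLM (a batch of 32 is split into 4 micro-batches of 8, the forward pass runs under autocast, and activations are recomputed via `torch.utils.checkpoint`):

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

use_cuda = torch.cuda.is_available()
device = "cuda" if use_cuda else "cpu"

# Toy model standing in for an LLM block (sizes are arbitrary).
model = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Method 2: mixed precision. GradScaler rescales the loss so FP16
# gradients don't underflow; it's a no-op when disabled (e.g. on CPU).
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

# Method 1: a smaller micro-batch, plus gradient accumulation to keep
# the same effective batch size (here 4 * 8 = 32).
accum_steps, micro_batch = 4, 8

optimizer.zero_grad(set_to_none=True)
for step in range(accum_steps):
    x = torch.randn(micro_batch, 64, device=device)
    y = torch.randint(0, 10, (micro_batch,), device=device)
    with torch.autocast(device_type=device, enabled=use_cuda):
        # Method 3: gradient checkpointing drops intermediate
        # activations and recomputes them during the backward pass.
        logits = checkpoint(model, x, use_reentrant=False)
        # Divide so the accumulated gradient matches a full-batch step.
        loss = nn.functional.cross_entropy(logits, y) / accum_steps
    scaler.scale(loss).backward()  # gradients accumulate across micro-batches

scaler.step(optimizer)  # one optimizer step per accumulated batch
scaler.update()
```

Gradient accumulation trades a little extra time for memory (one optimizer step per `accum_steps` forward/backward passes), while checkpointing trades recomputation for not storing activations; both compose cleanly with mixed precision.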

If you are a geek like me, you can play with the code here lol: https://colab.research.google.com/dri...

References
1. Automatic mixed precision training in PyTorch: https://pytorch.org/docs/stable/amp.h...
2. Gradient checkpointing in PyTorch: https://pytorch.org/docs/stable/check...
