DeepSeek V3 Best Open Source Huge Model Beats Claude Sonnet 3.5 in Coding Tested | FreelancersGPT


DeepSeek V3 is the best open-source LLM so far. Tech specs are as follows:

Training Hours:
The model was pre-trained on 14.8 trillion high-quality and diverse tokens.

DeepSeek V3 was trained for 2.78M H800 GPU hours at an estimated cost of $5.5M.

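Meta AI's Llama 3.1 405B (smaller than DeepSeek V3, whose 685B-parameter Hugging Face checkpoint counts the 671B main model plus 14B of multi-token prediction module weights) was trained for about 11x as long, 30.8M GPU hours, on a similar 15 trillion tokens.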

DeepSeek V3 benchmarks comparably to Claude 3.5 Sonnet, indicating that it's now possible to train a frontier-class model for less than $6M.
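
The ratios behind these figures can be checked with a few lines of Python. This is only a sketch using the rounded numbers quoted above; the variable names are made up for illustration, and the per-hour rate is simply what the two quoted totals imply, not an official price.

```python
# Rounded figures quoted in the description above.
deepseek_gpu_hours = 2.78e6   # DeepSeek V3, H800 GPU hours
deepseek_cost_usd = 5.5e6     # estimated training cost in USD
llama_gpu_hours = 30.8e6      # Llama 3.1 405B GPU hours

# How many times more GPU hours Llama 3.1 405B used.
gpu_hour_ratio = llama_gpu_hours / deepseek_gpu_hours

# Rate per GPU hour implied by the quoted cost and GPU hours.
implied_rate_usd = deepseek_cost_usd / deepseek_gpu_hours

print(f"Llama 3.1 405B used {gpu_hour_ratio:.1f}x the GPU hours of DeepSeek V3")
print(f"Implied rate: ${implied_rate_usd:.2f} per H800 GPU hour")
```

With these rounded inputs the ratio comes out at about 11x and the implied rate at just under $2 per GPU hour.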

Training GPUs:
Llama 3.1 405B trained on up to 16K H100 GPUs
DeepSeek V3 trained on 2048 GPUs
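
As a rough sanity check, the GPU hours and GPU count together imply a wall-clock training time. The sketch below assumes, purely for illustration, that all 2048 GPUs ran for the whole job with no idle time; the real schedule may have differed.

```python
# Figures quoted in the description above.
gpu_hours = 2.78e6   # total H800 GPU hours
gpu_count = 2048     # GPUs in the training cluster

# Assumption: every GPU is busy for the entire run.
wall_clock_hours = gpu_hours / gpu_count
wall_clock_days = wall_clock_hours / 24

print(f"Approximate wall-clock time: {wall_clock_days:.0f} days")
```

Under that assumption the run works out to a little under two months.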

"This level of capability is supposed to require clusters of closer to 16k GPUs" - Andrej Karpathy
https://x.com/karpathy/status/1872362...

Pricing:
DeepSeek V3 is roughly 10x cheaper than Claude 3.5 Sonnet
Input: $0.27/million tokens ($0.07/million tokens with cache hits)
Output: $1.10/million tokens

Claude 3.5 Sonnet is currently $3/million tokens for input and $15/million tokens for output
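
To see what the price gap means for a concrete job, here is a small sketch that applies the list prices above to a workload. The workload size (2M input tokens, 500k output tokens) and the helper name `request_cost` are hypothetical, and cache-hit pricing is ignored.

```python
# List prices in USD per million tokens, as quoted above.
PRICES_PER_MILLION = {
    "DeepSeek V3": {"input": 0.27, "output": 1.10},
    "Claude 3.5 Sonnet": {"input": 3.00, "output": 15.00},
}

def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a workload at the quoted list prices."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000

# Hypothetical workload, chosen only for illustration.
input_tokens, output_tokens = 2_000_000, 500_000

deepseek_cost = request_cost("DeepSeek V3", input_tokens, output_tokens)
claude_cost = request_cost("Claude 3.5 Sonnet", input_tokens, output_tokens)

print(f"DeepSeek V3:       ${deepseek_cost:.2f}")
print(f"Claude 3.5 Sonnet: ${claude_cost:.2f}")
print(f"Ratio:             {claude_cost / deepseek_cost:.1f}x")
```

The exact ratio depends on the mix of input and output tokens: on input alone it is about 11x, and on output alone about 14x.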


More Specs:
DeepSeek V3 is a Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
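
The point of the MoE design is that only a small share of the weights does work on any one token. A quick calculation from the two numbers above (variable names are illustrative only):

```python
# Parameter counts quoted above, in billions.
total_params_b = 671    # total parameters
active_params_b = 37    # parameters activated per token

# Share of the model that is active for a single token.
active_fraction = active_params_b / total_params_b

print(f"Active per token: {active_fraction:.1%} of all parameters")
```

So each token touches only about one eighteenth of the model, which is a large part of why per-token compute is far lower than the total parameter count suggests.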



Sources:

Paper:
https://github.com/deepseek-ai/DeepSe...

https://x.com/deepseek_ai/status/1872...




Kaggle notebooks:
The notebook was created for demonstration only and serves as guidance for those interested in using similar methods to build projects. It is NOT a free giveaway, for a few reasons:
1. It worked when the video was recorded, but it is not guaranteed to work at a later date, as tech communities make code changes all the time. Please follow the tutorial and create your own version if needed.
2. If you have any questions or need help, please join the Discord server community to discuss, or subscribe to the channel.
3. If you need further professional assistance, please feel free to book a consulting call.
Thanks for understanding.
