Fine tuning Pixtral - Multi-modal Vision and Text Model

Video description

➡️ Get Lifetime Access to the Complete Scripts (and future improvements): https://Trelis.com/ADVANCED-vision/
➡️ One-click fine-tuning and LLM templates: https://github.com/TrelisResearch/one...
➡️ Newsletter: https://blog.Trelis.com
➡️ Resources/Support/Discord: https://Trelis.com/About
➡️ Thumbnail made with this tutorial: Fine Tune Flux Diffusion Models with ...

VIDEO RESOURCES:
Slides: https://docs.google.com/presentation/...
Chess Dataset: https://huggingface.co/datasets/Treli...
Pixtral model on HF: https://huggingface.co/mistralai/Pixt...
Pixtral community/transformers model + discussion: https://huggingface.co/mistral-commun...
vLLM docs: https://docs.vllm.ai/en/latest/gettin...
Thumbnail created using Flux Schnell in this video: Fine Tune Flux Diffusion Models with ...

TIMESTAMPS:
0:00 How to fine-tune Pixtral
0:43 Video Overview
1:27 Pixtral architecture and design choices
3:51 Mistral’s custom image encoder - trained from scratch
8:35 Fine-tuning Pixtral in a Jupyter notebook
9:33 GPU setup for notebook fine-tuning and VRAM requirements
12:23 Getting a “transformers” version of Pixtral for fine-tuning
15:00 Loading Pixtral
16:21 Dataset loading and preparation
18:08 Chat templating (somewhat advanced, but recommended)
23:33 Inspecting and evaluating baseline performance on the custom data
26:34 Setting up data collation (including for multi-turn training)
31:09 Training on completions only (tricky but improves performance)
35:08 Setting up LoRA fine-tuning
41:04 Setting up training arguments (batch size, learning rate, gradient checkpointing)
43:36 Setting up TensorBoard
46:48 Evaluating the trained model
47:46 Merging LoRA adapters and pushing the model to hub
49:07 Measuring performance on OCR (optical character recognition)
50:28 Running Pixtral inference with vLLM and setting up an API endpoint
55:17 Video resources
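
NOTE ON "TRAINING ON COMPLETIONS ONLY" (31:09):
The idea behind that chapter is that the loss is computed on the assistant's replies only, not on the user prompt or image tokens. The snippet below is a minimal sketch of that idea and is not the code from the video. The function name, the token ids, and the (start, end) span format are all made up for illustration; in a real notebook the spans would be located from the chat template's markers. The one fixed detail is the label value -100, which PyTorch's CrossEntropyLoss ignores by default.

```python
# Label value that PyTorch's CrossEntropyLoss skips by default (ignore_index=-100).
IGNORE_INDEX = -100


def mask_prompt_labels(input_ids, assistant_spans):
    """Build training labels that keep only the assistant-reply positions.

    input_ids: list of token ids for one full (possibly multi-turn) conversation.
    assistant_spans: hypothetical list of (start, end) index pairs, end exclusive,
        marking where each assistant reply sits inside input_ids.
    """
    # Start with every position ignored, then copy back the assistant tokens.
    labels = [IGNORE_INDEX] * len(input_ids)
    for start, end in assistant_spans:
        labels[start:end] = input_ids[start:end]
    return labels


# Made-up two-turn conversation: assistant replies at positions 4-6 and 9-10.
example_ids = [1, 11, 12, 13, 21, 22, 2, 14, 15, 23, 2]
example_labels = mask_prompt_labels(example_ids, [(4, 7), (9, 11)])
print(example_labels)
```

Passing several spans covers the multi-turn case: every assistant turn contributes to the loss, while every user turn stays masked.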
