PaliGemma by Google: Train Model on Custom Detection Dataset

Learn how to fine-tune PaliGemma, Google's open vision-language model (VLM), for custom object detection. This step-by-step tutorial walks you through modifying Google's notebook to train PaliGemma on your own dataset. We'll use the handwritten digits and math operations dataset from RF100, look at the JSONL annotation format, and show how to deploy the fine-tuned model for inference. Along the way we cover what PaliGemma can do (image captioning, visual question answering, and object detection) and the limitations to keep in mind.
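
For reference, here is a minimal sketch of what one line of a detection JSONL file can look like. It assumes the commonly documented PaliGemma convention: each record has "image", "prefix", and "suffix" keys, and each box is written as four location tokens in y_min, x_min, y_max, x_max order, scaled to 1024 bins and followed by the class name. The file name, class names, helper functions, and the exact rounding rule are illustrative assumptions, not taken from the video or the notebook.

```python
import json


def to_loc_token(value: float, size: float) -> str:
    """Map a pixel coordinate to one of 1024 location bins, e.g. <loc0512>.

    Assumption: bins are computed by truncation and clamped to 0..1023.
    """
    bin_index = int(value / size * 1024)
    bin_index = max(0, min(1023, bin_index))
    return f"<loc{bin_index:04d}>"


def box_to_suffix(box, image_w, image_h, label):
    """Encode one (x_min, y_min, x_max, y_max) pixel box plus its label."""
    x_min, y_min, x_max, y_max = box
    tokens = (
        to_loc_token(y_min, image_h)
        + to_loc_token(x_min, image_w)
        + to_loc_token(y_max, image_h)
        + to_loc_token(x_max, image_w)
    )
    return f"{tokens} {label}"


def make_record(image_name, boxes, image_w, image_h, classes):
    """Build one JSONL record; multiple boxes are joined with ' ; '."""
    suffix = " ; ".join(
        box_to_suffix(box, image_w, image_h, label) for box, label in boxes
    )
    return {
        "image": image_name,
        "prefix": "detect " + " ; ".join(classes),
        "suffix": suffix,
    }


# Hypothetical example: a 640x480 image with a single box labeled "7".
record = make_record(
    "digits_001.jpg",
    [((64, 48, 320, 240), "7")],
    image_w=640,
    image_h=480,
    classes=["7", "plus"],
)
print(json.dumps(record))  # one such JSON object per line makes up the file
```

Each image becomes one JSON object on its own line, so a dataset split is a single text file that can be streamed during training.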

Chapters:

00:00 PaliGemma Capabilities
02:03 Environment Setup
05:25 Dataset Format
09:07 Downloading Pre-trained Model
11:27 Loading Dataset
13:45 Training and Evaluating the Model
15:19 Deploying the Model
17:37 Important Considerations
20:02 Outro

Resources:

Roboflow: https://roboflow.com

🔴 Community Session June 6th, 2024 at 08:00 AM PDT / 11:00 AM EDT / 05:00 PM CEST: https://roboflow.stream

⭐ Notebooks GitHub: https://github.com/roboflow/notebooks
⭐ Supervision GitHub: https://github.com/roboflow/supervision

📓 PaliGemma notebook: https://colab.research.google.com/git...

🗞 Gemma arXiv paper: https://arxiv.org/pdf/2403.08295
🗞 SigLIP arXiv paper: https://arxiv.org/pdf/2303.15343
🗞 PaliGemma blog post: https://blog.roboflow.com/how-to-fine...

🔗 RF100: https://www.rf100.org
🔗 PaliGemma model card: https://www.kaggle.com/models/google/...
🔗 PaliGemma fine-tuned checkpoints: https://huggingface.co/collections/go...
🔗 PaliGemma HF Space: https://huggingface.co/spaces/big-vis...

Stay updated with the projects I'm working on at https://github.com/roboflow and https://github.com/SkalskiP! ⭐
