RAG from the Ground Up with Python and Ollama

Retrieval-Augmented Generation (RAG) is the de facto technique for giving LLMs the ability to interact with any document or dataset, regardless of its size. Follow along as I cover how to parse and manipulate documents, explore how embeddings are used to describe abstract concepts, implement a simple yet powerful way to surface the parts of a document most relevant to a given query, and ultimately build a script you can use to have a locally hosted LLM engage with your own documents.
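
In outline, the script parses a document into chunks and embeds each chunk with a local embedding model. Here is a minimal sketch of those first two steps, assuming the nomic-embed-text model linked below has been pulled (ollama pull nomic-embed-text). parse_file() is named in the timestamps, but the paragraph-splitting logic shown here is an assumption, not necessarily the video's exact code:

import ollama

def parse_file(filename):
    # Split a plain-text file into paragraph chunks on blank lines.
    # (The chunking strategy is an assumption; the video's parse_file()
    # may split differently.)
    with open(filename, encoding="utf-8") as f:
        text = f.read()
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def embed_chunks(chunks, model="nomic-embed-text"):
    # One embedding vector per chunk. For large documents this is the
    # slow step, which is why the video caches the results to disk.
    return [
        ollama.embeddings(model=model, prompt=chunk)["embedding"]
        for chunk in chunks
    ]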

Check out my other Ollama videos: • Get Started with Ollama

Links:
Code from video - https://decoder.sh/videos/rag-from-th...
Ollama Python library - https://github.com/ollama/ollama-python
Project Gutenberg - https://www.gutenberg.org
Nomic Embedding model (on Ollama) - https://ollama.com/library/nomic-embe...
BGE Embedding model - https://huggingface.co/CompendiumLabs...
How to use a model from HF with Ollama - • Importing Open Source Models to Ollama
Cosine Similarity - https://blog.gopenai.com/rag-for-ever...
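
The cosine similarity article above covers the math; in plain Python, comparing two embedding vectors looks roughly like this (a sketch, not necessarily the code from the video):

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|): scores near 1.0 mean the vectors point in
    # similar directions, i.e. the chunks are more semantically similar.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)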

Timestamps:
00:00 - Intro
00:26 - Environment Setup
00:49 - Function review
01:50 - Source Document
02:18 - Starting the project
02:37 - parse_file()
04:35 - Understanding embeddings
05:40 - Implementing embeddings
07:01 - Timing embedding
07:35 - Caching embeddings
10:06 - Prompt embedding
10:19 - Cosine similarity for embedding comparison
12:16 - Brainstorming improvements
13:15 - Giving context to our LLM (see the sketch after these timestamps)
14:29 - CLI input
14:49 - Next steps
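
Putting the retrieval and context steps from the timestamps together, a hedged sketch building on the helpers above (the mistral chat model, the system prompt wording, and top_k=5 are my assumptions, not necessarily what the video uses):

import ollama

def answer(query, chunks, chunk_embeddings, model="mistral", top_k=5):
    # Embed the query the same way the document chunks were embedded.
    q = ollama.embeddings(model="nomic-embed-text", prompt=query)["embedding"]
    # Score every chunk against the query and keep the best matches.
    scored = sorted(
        ((cosine_similarity(q, e), c) for c, e in zip(chunks, chunk_embeddings)),
        reverse=True,
    )
    context = "\n\n".join(c for _, c in scored[:top_k])
    # Hand the retrieved context to a locally hosted model.
    response = ollama.chat(model=model, messages=[
        {"role": "system",
         "content": "Answer using only this context:\n\n" + context},
        {"role": "user", "content": query},
    ])
    return response["message"]["content"]

# Example usage (filename and question are placeholders):
# chunks = parse_file("document.txt")
# print(answer("What happens in chapter one?", chunks, embed_chunks(chunks)))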
