A Simple Introduction to Retrieval Augmented Generation (RAG)

A Simple Introduction to RAG

Retrieval-Augmented Generation (RAG) is emerging as a key approach to integrating large language models (LLMs) into industrial applications. This year, several projects in our CITS5553 Data Science Capstone have been exploring how RAG and LLM technologies can solve challenges across diverse domains for social good.

While the field is evolving rapidly, the foundational concepts remain steady. To help our students understand these fundamentals, we’ve created a simple RAG tutorial as part of the CITS5553 course, and I’m excited to share it with the broader community. Hopefully, this will inspire new ideas and applications across a wider range of industries.

A few (potentially unpopular) opinions:

• Should we include tools like LangChain or LlamaIndex in RAG applications? In my view, they’re not yet mature enough for production environments.
• What is the essential component that sets a RAG application apart from a traditional system? My answer: a suitable vector database.

We’ve included two code demonstrations in the tutorial:

1. Chat with a Website: A fully OpenAI-based RAG solution (214 lines of code) featuring a web crawler, OpenAI’s text-embedding-ada-002 model, OpenAI Vector Store, and GPT-4o (a minimal sketch of this pipeline follows below).
2. Chat with PDF Files: A fully local RAG solution (171 lines of code) using PyMuPDF for PDF processing, the e5-large model for embedding, Qdrant as the vector store, and Phi-3-mini-4k-instruct as the LLM (also sketched after this list).
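To give a feel for the first demonstration without reproducing the tutorial code, here is a minimal sketch of the "chat with a website" pipeline, assuming the OpenAI Python SDK, `requests`, and BeautifulSoup. The example URL, the 800-character chunking, the top-3 retrieval, and the use of a plain NumPy array in place of the hosted OpenAI Vector Store are simplifications for brevity, not the tutorial's actual implementation.

```python
# Minimal sketch: crawl one page, embed chunks with text-embedding-ada-002,
# retrieve the closest chunks, and answer with GPT-4o.
import numpy as np
import requests
from bs4 import BeautifulSoup
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def crawl(url: str) -> str:
    """Fetch one page and reduce it to visible text (stand-in for a real crawler)."""
    soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
    return soup.get_text(separator=" ", strip=True)


def chunk(text: str, size: int = 800) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]


def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([d.embedding for d in resp.data])


# Index: crawl, chunk, embed. The tutorial stores the vectors in an OpenAI
# Vector Store; a NumPy array plays that role here to keep the sketch short.
pages = chunk(crawl("https://example.com"))
index = embed(pages)

# Query: embed the question, retrieve the top-3 chunks, ask GPT-4o.
question = "What is this website about?"
q = embed([question])[0]
top = np.argsort(index @ q)[-3:][::-1]  # ada-002 vectors are unit-length, so dot product = cosine
context = "\n\n".join(pages[i] for i in top)

answer = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)
```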
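Likewise, here is a rough sketch of the second, fully local pipeline, assuming PyMuPDF, sentence-transformers for the e5-large checkpoint (`intfloat/e5-large`), an in-memory Qdrant instance, and the Hugging Face `transformers` pipeline for Phi-3-mini-4k-instruct. The file name `report.pdf`, per-page chunking, and the prompt template are illustrative choices, not the tutorial's exact code.

```python
# Minimal sketch: extract PDF text, embed with e5-large, store in Qdrant,
# retrieve the closest chunks, and generate an answer with Phi-3-mini.
import fitz  # PyMuPDF
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer
from transformers import pipeline

# 1. Extract text page by page with PyMuPDF (each page becomes one chunk here).
doc = fitz.open("report.pdf")  # hypothetical file name
chunks = [text for page in doc if (text := page.get_text()).strip()]

# 2. Embed chunks with e5-large; E5 models expect "passage:" / "query:" prefixes.
embedder = SentenceTransformer("intfloat/e5-large")
vectors = embedder.encode([f"passage: {c}" for c in chunks], normalize_embeddings=True)

# 3. Store the vectors in an in-memory Qdrant collection.
qdrant = QdrantClient(":memory:")
qdrant.create_collection(
    collection_name="pdf",
    vectors_config=VectorParams(size=vectors.shape[1], distance=Distance.COSINE),
)
qdrant.upsert(
    collection_name="pdf",
    points=[
        PointStruct(id=i, vector=v.tolist(), payload={"text": c})
        for i, (v, c) in enumerate(zip(vectors, chunks))
    ],
)

# 4. Retrieve the top-3 chunks for a question and answer with Phi-3-mini.
question = "What are the key findings?"
query_vec = embedder.encode(f"query: {question}", normalize_embeddings=True)
hits = qdrant.search(collection_name="pdf", query_vector=query_vec.tolist(), limit=3)
context = "\n\n".join(h.payload["text"] for h in hits)

# Runs on CPU by default; add device_map/torch_dtype arguments for GPU inference.
generator = pipeline("text-generation", model="microsoft/Phi-3-mini-4k-instruct",
                     trust_remote_code=True)
prompt = f"<|user|>\nContext:\n{context}\n\nQuestion: {question}<|end|>\n<|assistant|>\n"
print(generator(prompt, max_new_tokens=300, return_full_text=False)[0]["generated_text"])
```

Both sketches follow the same two-phase shape: an offline indexing step (collect, chunk, embed, store) and an online query step (embed the question, retrieve, generate), which is the core pattern the tutorial walks through.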
