Learn how to build and deploy a RAG web application using Python, Streamlit and LangChain, so multiple users can chat with Documents, Websites and other custom data online.
Code: https://github.com/enricd/rag_llm_app
Blog: / program-a-rag-llm-chat-app-with-langchain-...
The RAG LLM Streamlit App: https://rag-llm-app.streamlit.app/
In this RAG LLM course, we will learn how to develop a Retrieval Augmented Generation (RAG) pipeline step by step, and how to integrate it into a Chat Web App, in Python, using LangChain and Streamlit.
As you probably already know, LLMs are trained on large amounts of public data up to a certain cutoff date. Any fact that is not public, newer than that cutoff, or very niche is essentially unknown to them. Although newer models tend to be better at recalling facts from their training set, they are still far from perfect. This is a limiting factor for many tasks that, for one reason or another, require the model to know specific topics very precisely.
RAG consists of connecting a source of custom information to our LLM chat pipeline. Before sending a question to the model, we automatically retrieve the most relevant fragments of context from this database and place them next to the question, so the model has the precise details it needs directly in its context. This way, the model knows exactly what we are talking about and where the information comes from, and we can update that information easily, at almost no cost and with no GPU required. We can use any already available LLM, like GPT-4o from the OpenAI API (now or soon even o1 and o1-mini!), Claude 3.5 from the Anthropic API, or even open-source models with their original weights, cheaply and efficiently. If a better model appears tomorrow, we can plug it into our RAG pipeline almost immediately and take advantage of it without having to fine-tune any LLM again.
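As a minimal sketch of that last point: in LangChain, chat models from different providers share the same interface, so swapping the underlying LLM is a one-line change (the model names below are illustrative examples, not necessarily the ones used in the repo):

```python
# Minimal sketch: swapping the LLM behind the RAG pipeline (model names are examples).
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

# Both classes implement the same chat-model interface, so the retriever,
# prompts, and chains built around `llm` stay unchanged when we switch providers.
llm = ChatOpenAI(model="gpt-4o")                          # OpenAI API
# llm = ChatAnthropic(model="claude-3-5-sonnet-20240620") # Anthropic API

print(llm.invoke("Hello!").content)
```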
In summary, this is an AI coding tutorial on how to use the LangChain chains create_history_aware_retriever, create_retrieval_chain and create_stuff_documents_chain to retrieve data from a Chroma DB vector store, where we store our custom data embeddings using the OpenAI Embeddings model. This data is loaded with several different LangChain document loaders and split using RecursiveCharacterTextSplitter. What's more, you will see how to make use of the OpenAI API and the Anthropic API to send requests and get answers from their Large Language Models.
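For reference, here is a minimal, self-contained sketch of how those pieces fit together. The source URL, chunk sizes and prompt wording are illustrative assumptions, not the exact code from the repo:

```python
# Minimal RAG pipeline sketch with the chains named above (values are illustrative).
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load the custom data (a website here; other loaders handle PDFs, docs, etc.)
#    and split it into overlapping chunks.
docs = WebBaseLoader("https://example.com/my-docs").load()  # hypothetical URL
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = splitter.split_documents(docs)

# 2. Embed the chunks with the OpenAI Embeddings model and store them in Chroma.
vectorstore = Chroma.from_documents(splits, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

llm = ChatOpenAI(model="gpt-4o-mini")

# 3. Rephrase the latest user question using the chat history, so follow-ups
#    like "and what about X?" still retrieve the right chunks.
rephrase_prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
    ("human", "Given the conversation above, rewrite the last question as a standalone search query."),
])
history_aware_retriever = create_history_aware_retriever(llm, retriever, rephrase_prompt)

# 4. "Stuff" the retrieved chunks into the prompt and answer the question.
qa_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the question using only this context:\n\n{context}"),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])
rag_chain = create_retrieval_chain(
    history_aware_retriever,
    create_stuff_documents_chain(llm, qa_prompt),
)

result = rag_chain.invoke({"input": "What does the document say about pricing?", "chat_history": []})
print(result["answer"])
```

In the app itself, the chat history comes from the Streamlit session state and the answer is streamed back to the user, but the chain structure is the same.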
💡 Make sure to follow me on Medium, YouTube and GitHub: in the next blog and video we will see how to deploy this app to Azure, using GPT-4o and GPT-4o mini through Azure OpenAI Service, and how to add SSO authentication in front of the app so that only authorized users under our Azure subscription (for example, your work colleagues) can access it, and no one else can spend our resources or steal our data!
Sections:
00:00 - Intro
2:08 - What is RAG and why it's better than Fine Tuning
7:34 - RAG in Python with LangChain step by step
19:00 - Integrating RAG into an LLM Chat web app
37:20 - Deploy the RAG web app online for free!
Subscribe to see more AI and ML programming-related content! 🚀🚀
-------------------------------------------------------------
Kaggle: https://www.kaggle.com/edomingo
GitHub: https://github.com/enricd
Twitter: / mad_enrico
Linkedin: / e-domingo
Medium: / enricdomingo
Web: https://enricdomingo.com
#gpt #gpt4o #gpt4 #openai #promptengineering #langchain #o1 #openaio1 #chatgpto1 #openaistrawberry #chatgpt #openaiapi #python #streamlit #github #cloud #portfolio #agent #aiagents #automation #ai #llm #copilot #chatgpt4o #omnichat #omnidata #howtochatgpt #git #vscode #gui #pythongui #stream #modelstream #streaming #llmstream #llmstreaming #openaistream #openaistreaming #rag #retrievalaugmentedgeneration #langchainclaude #anthropiclangchain #llamaindex #ollama #llamacpp