Retrieval Augmented Generation: Using your data with LLMs
Sanket Sudake (संकेत) (~tripples)
Description:
Large Language Models (LLMs) are great at answering questions about the data they were trained on, which is mostly content available in the public domain. However, most enterprises want to use LLMs with their own specific data. LLM fine-tuning, Retrieval Augmented Generation (RAG), and fitting data into the prompt context are different ways to achieve this.
In this talk, we cover RAG and different aspects of it.
Outline of talk
- Understanding RAG and its relation with LLMs (3 min)
- Overview of Naive RAG and vector databases (5 min)
- Components of RAG solution and pipelines involved (7 min)
  - Data processing
  - Vector Embedding and Injection
  - Query reconstruction based on Chat History
  - Retrieval Augmented Generation Pipeline
- Practical RAG implementation with Langchain and demonstration (5 min)
- Evaluating and optimizing RAG for the production environment (5 min)
  - Evaluating RAG and metrics for tracking performance
  - Quality of retrieved chunks
  - Consideration for multitenancy and observability
- Q/A (5 min)
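To give a flavor of the naive RAG pipeline named in the outline, here is a minimal sketch in plain Python. It is not the talk's demo and does not use Langchain: the word-count "embedding", the in-memory list standing in for a vector database, the sample documents, and every function name are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def chunk(text, size=12):
    """Data processing: split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding (assumption): lowercase word counts.
    A real pipeline would call an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_index(documents):
    """Ingestion: store (chunk, vector) pairs; stands in for a vector database."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index, query, k=2):
    """Retrieval: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(index, query, k=2):
    """Augmentation: place retrieved chunks in the prompt sent to the LLM."""
    context = "\n".join(retrieve(index, query, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical proprietary documents.
DOCS = [
    "Acme's refund window is 30 days from the delivery date.",
    "Support is available on weekdays between nine and five.",
]
INDEX = build_index(DOCS)

if __name__ == "__main__":
    print(build_prompt(INDEX, "What is the refund window?", k=1))
```

The generation step is left out on purpose: the string returned by `build_prompt` is what would be passed to whichever LLM is in use.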
Takeaways
- Understanding of RAG for using proprietary data with LLMs
- Practical ways for RAG implementation and corner cases
- Approach for taking RAG solution to production
Prerequisites:
Basic understanding of Large Language Models (LLMs)
Content URLs:
I presented the same talk earlier at the Python Pune May 2024 meetup, where it was well received.
The following are resources,
Speaker Info:
I am a Principal Engineer at InfraCloud with over 10 years of experience. My areas of interest are AI, Cloud, and Distributed Systems. I am also an open-source contributor and tech enthusiast who likes to explore different technologies, from the Linux kernel to various cloud platforms. I am a maintainer of Fission, the open-source FaaS platform, and contributed to OpenStack in the past. In AI specifically, I have been working on different AI implementations over the past year, a few of which are in production.
Speaker Links:
- Previous talks
  - Python Pune May 2024
  - AWS Community Day 2020
  - GopherCon India 2018
  - Kubernetes Pune Meetup
- GitHub @sanketsudake
- Twitter @sanketsudake