Retrieval Augmented Generation: Using your data with LLMs

Sanket Sudake (संकेत ) (~tripples) | 30 May, 2024

Description:

Large Language Models (LLMs) are good at answering questions about the data they were trained on, which is mostly content available in the public domain. However, most enterprises want to use LLMs with their own data. LLM fine-tuning, RAG, and fitting data into the prompt context are different ways to achieve this.

In this talk, we cover RAG and different aspects of it.

Outline of talk

Understanding RAG and its relation with LLMs (3 min)
Overview of Naive RAG and vector databases (5 min)
Components of RAG solution and pipelines involved (7 min)
- Data processing
- Vector Embedding and Ingestion
- Query reconstruction based on Chat History
- Retrieval Augmented Generation Pipeline
Practical RAG implementation with LangChain and demonstration (5 min)
Evaluating and optimizing RAG for the production environment (5 min)
- Evaluating RAG and metrics for tracking performance
- Quality of retrieved chunks
- Considerations for multitenancy and observability
Q/A (5 min)
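
To make the pipeline stages in the outline concrete, here is a minimal sketch of a naive RAG flow in plain Python: data processing (chunking), embedding and ingestion into a store, retrieval, and prompt assembly. It is only an illustration and not the code from the talk or demo. The names chunk_text, embed, ToyVectorStore, and build_prompt are hypothetical, the bag-of-words "embedding" is a stand-in for a real embedding model, the in-memory list stands in for a vector database, the sample documents are made up, and the final LLM call is left out.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=8):
    """Data processing: split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding (assumption): bag-of-words counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database (hypothetical helper)."""

    def __init__(self):
        self.items = []  # list of (chunk, embedding) pairs

    def add(self, chunks):
        """Ingestion: embed each chunk and keep it alongside its vector."""
        for chunk in chunks:
            self.items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Retrieval: return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Augmentation: place the retrieved chunks in the prompt sent to the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Made-up sample documents, for illustration only.
docs = [
    "Refund requests are accepted within 30 days of purchase.",
    "The office cafeteria serves lunch from noon until two.",
]

store = ToyVectorStore()
for doc in docs:
    store.add(chunk_text(doc, max_words=20))

question = "When are refund requests accepted?"
top_chunks = store.search(question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)  # in a real pipeline, this prompt would be sent to an LLM
```

In a production setup, each stand-in would be replaced by a real component (an embedding model, a vector database, an LLM client), and query reconstruction from chat history would run before the retrieval step.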

Takeaways

Understanding of RAG for using proprietary data with LLMs
Practical ways for RAG implementation and corner cases
Approach for taking RAG solution to production

Prerequisites:

Basic understanding of Large Language Models (LLMs)

Content URLs:

I presented the same talk earlier at the Python Pune May 2024 meetup, where it was well received.

The following are resources:

Speaker Info:

Sanket Sudake

I am a Principal Engineer at InfraCloud with over 10 years of experience. My areas of interest are AI, Cloud, and Distributed Systems. I am also an open-source contributor and tech enthusiast who likes to explore different technologies, from the Linux kernel to different cloud platforms. I am a maintainer of Fission, the open-source FaaS platform, and have contributed to OpenStack in the past. In AI specifically, I have been working on different AI implementations over the past year, a few of which are in production.

Speaker Links:

Previous talks
- Python Pune May 2024
- AWS Community Day 2020
- GopherCon India 2018
- Kubernetes Pune Meetup
GitHub @sanketsudake
Twitter @sanketsudake

Section:	Artificial Intelligence and Machine Learning
Type:	Talk
Target Audience:	Intermediate
Last Updated:	30 May, 2024