Retrieval Augmented Generation: Using your data with LLMs

Sanket Sudake (संकेत ) (~tripples)



Description:

Large Language Models (LLMs) are great at answering questions about the data on which they have been trained, which is mostly content available in the public domain. However, most enterprises want to use LLMs with their own specific data. LLM fine-tuning, RAG, and fitting data into the prompt context are different ways to achieve this.

In this talk, we cover RAG and its different aspects.

Outline of talk

  • Understanding RAG and its relation with LLMs (3 min)
  • Overview of Naive RAG and vector databases (5 min)
  • Components of a RAG solution and the pipelines involved (7 min)
    • Data processing
    • Vector Embedding and Ingestion
    • Query reconstruction based on Chat History
    • Retrieval Augmented Generation Pipeline
  • Practical RAG implementation with LangChain and demonstration (5 min)
  • Evaluating and optimizing RAG for the production environment (5 min)
    • Evaluating RAG and metrics for tracking performance
    • Quality of retrieved chunks
    • Considerations for multitenancy and observability
  • Q/A (5 min)
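
To give a feel for the pipeline stages in the outline, here is a minimal sketch of a naive RAG flow in plain Python. It is an illustration under loud assumptions, not the implementation shown in the talk: the bag-of-words `embed` function stands in for a real embedding model, `InMemoryVectorStore` is a hypothetical stand-in for a vector database, the sample documents are made up, and the final LLM call (as well as query reconstruction from chat history) is left out. It does not use LangChain.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Data processing: split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (chunk, vector) pairs

    def add(self, chunks):
        """Ingestion: embed each chunk and store it next to its vector."""
        for chunk in chunks:
            self.items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Retrieval: return the k chunks most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Augmentation: place the retrieved chunks into the prompt sent to the LLM."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up sample documents, for illustration only.
docs = [
    "Fission is an open-source serverless framework for Kubernetes.",
    "The cafeteria serves lunch between noon and two in the afternoon.",
    "Vector databases store embeddings and support similarity search.",
]

store = InMemoryVectorStore()
for doc in docs:
    store.add(chunk_text(doc))

question = "What is Fission?"
top_chunks = store.search(question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)  # this prompt would then be passed to an LLM of your choice
```

In a production setup, each stand-in would be replaced by a real component: an embedding model for `embed`, a vector database for the store, and an LLM call that consumes the prompt.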

Takeaways

  • Understanding of RAG for using proprietary data with LLMs
  • Practical ways to implement RAG, along with its corner cases
  • An approach for taking a RAG solution to production

Prerequisites:

Basic understanding of Large Language Models (LLMs)

Content URLs:

I presented the same talk earlier at the Python Pune May 2024 meetup, where it was well received.

The following are resources,

Speaker Info:

Sanket Sudake

I am a Principal Engineer at InfraCloud with over 10 years of experience. My areas of interest are AI, Cloud, and Distributed Systems. I am also an open-source contributor and tech enthusiast who likes to explore different technologies, from the Linux kernel to different cloud platforms. I am a maintainer of Fission, the open-source FaaS platform, and have contributed to OpenStack in the past. In AI specifically, I have been dabbling with different AI implementations over the past year, a few of which are in production.

Speaker Links:

Section: Artificial Intelligence and Machine Learning
Type: Talk
Target Audience: Intermediate
Last Updated: