RAG Brag - Building Production-Ready LLM Apps

JAYITA BHATTACHARYYA (~jayita13)



Description:

Building advanced RAG (Retrieval-Augmented Generation) pipelines with LlamaIndex. We will delve into the workings of RAG systems for querying your own knowledge-base data, understand how enterprise solutions are built on these fast-moving technologies, and learn to use them through code implementation. The focus is on upgrading from PoC to production, where performance is key to achieving enhanced results. We will walk through the building blocks needed to set up an advanced RAG pipeline that can be deployed and scaled in real time.

The flow includes a module for each step of the end-to-end pipeline:

  • Data ingestion from varied sources (PDFs, text files, APIs, databases, etc.).
  • Splitting data into chunks using appropriate chunking strategies for text.
  • Converting chunks into vector embeddings, selecting the best-suited embedding model.
  • Storing embeddings, structured or unstructured, in vector databases (in-memory or cloud) or knowledge-graph DBs.
  • Plugging into a query engine for retrieval, focusing on better search methodologies such as hybrid search, metadata extraction, and structured prompt context.
  • Post-retrieval techniques for enhanced performance, such as long-context reordering and reranking models.
  • Setting up agents for specific tasks to build Agentic RAG.
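As a framework-agnostic sketch of the pipeline stages above (in the talk these steps map onto LlamaIndex components), the following toy example chunks a document, "embeds" each chunk with simple term frequencies as a stand-in for a real embedding model, and retrieves with a hybrid score blending dense (cosine) and sparse (keyword-overlap) signals. The chunk size, the toy embedding, and the `alpha` weight are all illustrative assumptions, not production choices.

```python
import math
import re
from collections import Counter

def chunk(text: str, max_words: int = 50) -> list[str]:
    """Split text into fixed-size word chunks (a naive chunking strategy)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase term frequencies (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query: str, store: list[tuple[str, Counter]],
                  alpha: float = 0.5) -> list[str]:
    """Rank chunks by a blend of dense (cosine) and sparse (overlap) scores."""
    q_vec = embed(query)
    q_terms = set(q_vec)

    def score(text: str, vec: Counter) -> float:
        sparse = len(q_terms & set(vec)) / len(q_terms) if q_terms else 0.0
        return alpha * cosine(q_vec, vec) + (1 - alpha) * sparse

    return [c for c, v in sorted(store, key=lambda cv: score(*cv), reverse=True)]

# Ingest -> chunk -> embed -> store -> retrieve
docs = "Vector databases store embeddings. Reranking improves retrieval quality."
store = [(c, embed(c)) for c in chunk(docs, max_words=4)]
top = hybrid_search("vector embeddings", store)[0]
```

In a real pipeline the embedding model, vector store, and reranker would each be swappable components, which is exactly the modularity LlamaIndex provides.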

The final step is to build a deployable chatbot with a readily available UI (Chainlit, Streamlit, or Gradio). This approach is uniquely useful for applying LLMs to large knowledge bases and augmenting them: you manage and control the data to make informed decisions across business use cases in BFSI, legal, healthcare, and other industries. RAG is here to stay & slay, with copilots built on massive data that automate processes and save time-consuming manual effort.

Key Takeaways:

  • Hands-on implementation of an end-to-end deployable RAG pipeline.
  • LLM inference optimization techniques: reducing cost and API calls, caching, chaining, etc.
  • Efficient ways of using data and language frameworks for RAG systems with tools & agents.
  • A brief on grounding and hallucination as they relate to LLMs and generative AI.
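To illustrate the caching takeaway above: memoizing prompts means identical queries never trigger a second (billable) LLM API call. This is a minimal sketch using the standard library's `functools.lru_cache`; `fake_llm_call` is a hypothetical stand-in for a real provider client, and the call counter only exists to make the savings visible.

```python
from functools import lru_cache

CALLS = {"count": 0}

def fake_llm_call(prompt: str) -> str:
    """Stand-in for a real LLM client; each call would normally cost money."""
    CALLS["count"] += 1
    return f"answer to: {prompt}"

@lru_cache(maxsize=1024)
def cached_llm_call(prompt: str) -> str:
    """Identical prompts are served from the in-memory cache."""
    return fake_llm_call(prompt)

cached_llm_call("What is RAG?")
cached_llm_call("What is RAG?")  # cache hit; no second API call
```

Production systems typically extend the same idea with semantic caching (matching near-duplicate prompts by embedding similarity) and a shared cache store rather than per-process memory.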

Prerequisites:

Basic Python programming; basic AI/ML knowledge

Content URLs:

Github repo

Blog

Slideshow

Speaker Info:

This is Jayita Bhattacharyya, an official code-breaker, working as a Senior Associate Consultant at the Infosys Center for Emerging Technology Solutions. Her current focus is generative AI: along with her team, she helps clients incorporate AI into existing software products by building need-of-the-hour solutions. A passionate follower of the buzzing AI/ML space and a hackathon wizard, she recently won the Infosys Data For AI Hackathon and the Informatica Data Engineering 2024 Hackathon, and loves writing blogs to share insights on the state-of-the-art technologies she has worked on.

Section: Artificial Intelligence and Machine Learning
Type: Talk
Target Audience: Intermediate
Last Updated: