Build a production-ready database, search engine, and integrate semantic search with OpenAI using PyMongo

Viraj Thakrar (~virajut)



Description:

Modern platforms typically have to serve operational, analytical and relevance-based search workloads. The system usually needs a different tool for each of these workloads, which becomes a daunting task for developers, database administrators and system administrators.

The traditional way of adding these features introduces an additional data processing layer, where ETL pipelines, automation and field mappings take place. Because data has to be copied between systems, the results also cannot be close to real time.

In this workshop, we will explore how to use MongoDB Atlas, a managed data platform, with PyMongo to build a powerful, production-ready database, search engine and vector search capability. This integration removes the need to separately manage the synchronisation of the operational data with the analytical data and the search engine index. With it, an application can communicate with the database, query the search engine and perform vector search through a unified API and a single connection.

We will cover everything from configuration to a production-ready deployment, using a sample application.

Three use cases that we will cover:

  • A primary database with an operational workload
  • A search engine with a search workload
  • A vector search feature with LLM integration
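
As a rough illustration of how these three use cases can share one collection handle, here is a minimal sketch. It is not the workshop's sample application: the field names (`customer_id`, `created_at`, `description`, `embedding`) and index names (`default`, `vector_index`) are assumptions made up for the example, and `collection` is assumed to be a PyMongo `Collection` obtained from a single `MongoClient` connected to Atlas.

```python
# Sketch only: the field names ("customer_id", "created_at", "description",
# "embedding") and index names ("default", "vector_index") are hypothetical.
# `collection` is assumed to be a pymongo Collection from a single MongoClient.


def recent_orders(collection, customer_id, limit=10):
    """Operational workload: an ordinary query, newest documents first."""
    cursor = collection.find(
        {"customer_id": customer_id},
        sort=[("created_at", -1)],
        limit=limit,
    )
    return list(cursor)


def text_search_pipeline(query, index="default", path="description", limit=10):
    """Search workload: an Atlas Search `$search` stage with relevance scores."""
    return [
        {"$search": {"index": index, "text": {"query": query, "path": path}}},
        {"$limit": limit},
        {"$project": {path: 1, "score": {"$meta": "searchScore"}}},
    ]


def vector_search_pipeline(query_vector, index="vector_index", path="embedding",
                           text_field="description", limit=5, num_candidates=100):
    """Vector workload: an Atlas `$vectorSearch` stage over stored embeddings."""
    return [
        {"$vectorSearch": {
            "index": index,
            "path": path,
            "queryVector": list(query_vector),
            "numCandidates": num_candidates,
            "limit": limit,
        }},
        # Return the readable text and the similarity score, not the raw vector.
        {"$project": {text_field: 1, "score": {"$meta": "vectorSearchScore"}}},
    ]
```

Each pipeline would be passed to `collection.aggregate(...)` on the same collection that serves `recent_orders`, so no separate search cluster or client is involved. The search and vector indexes themselves have to exist in Atlas before these stages return results.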

Key takeaways from this workshop:

  • How to set up and configure a managed data platform, MongoDB Atlas (15 minutes)
  • How to configure PyMongo, the Python driver, to communicate with MongoDB Atlas (15 minutes)
  • How to perform CRUD operations with PyMongo (20 minutes)
  • How to write and execute aggregation pipelines with PyMongo (best suited for operational and analytical workloads) (30 minutes)
  • How to plan for indexes and queries (15 minutes)
  • How to configure a search engine and perform search queries with PyMongo (30 minutes)
  • How to configure MongoDB Atlas vector search with OpenAI (30 minutes)
  • What to consider when deploying this application to production (10 minutes)
  • How the application architecture looks with the traditional approach versus the modern approach of using a managed data platform (10 minutes)
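
To give a flavour of the aggregation and OpenAI topics listed above, the following sketch shows an analytical aggregation pipeline and a small embedding helper. It is an illustration under stated assumptions rather than the workshop material: the fields `created_at`, `category` and `amount` are hypothetical, the model name `text-embedding-3-small` is just one possible choice, and `openai_client` is assumed to be an `openai.OpenAI()` instance from the 1.x Python SDK.

```python
# Sketch only: field names and the default model name are assumptions.


def revenue_by_category_pipeline(since):
    """Analytical workload: total revenue and order count per category."""
    return [
        # Filtering first lets the server use an index on "created_at".
        {"$match": {"created_at": {"$gte": since}}},
        {"$group": {
            "_id": "$category",
            "revenue": {"$sum": "$amount"},
            "orders": {"$sum": 1},
        }},
        {"$sort": {"revenue": -1}},
    ]


def embed_text(openai_client, text, model="text-embedding-3-small"):
    """Turn a piece of text into an embedding vector via the OpenAI API.

    `openai_client` is assumed to expose `embeddings.create(...)`, as the
    `openai.OpenAI` client does in the 1.x SDK.
    """
    response = openai_client.embeddings.create(model=model, input=text)
    return response.data[0].embedding
```

The pipeline would be run with `collection.aggregate(revenue_by_category_pipeline(some_datetime))`. The vector returned by `embed_text` can be stored on a document at write time and used as the query vector for a `$vectorSearch` stage at read time, provided both sides use the same embedding model.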

Prerequisites:

  • A system with Conda installed
  • Basic knowledge of Python
  • Basic knowledge of database systems
  • Basic knowledge of cloud systems
  • Basic knowledge of NoSQL databases

Speaker Info:

Viraj Thakrar is a software engineer and techpreneur from Ahmedabad, Gujarat. He is a MongoDB Certified Developer and has been a MongoDB user for 8+ years. He has experience working with different tech stacks and has worked with startups and teams from across the globe. He is the founder of Webstring Global Services, based in Ahmedabad, Gujarat, and leads the MongoDB User Group Ahmedabad with his teammate.

Speaker Links:

LinkedIn

Section: Data Science, AI & ML
Type: Workshops
Target Audience: Intermediate
Last Updated: