Data Catalog enrichment using Generative AI

Sasidhar Donaparthi (~sasidhar)


Description:

One of the major challenges in data management is keeping the enterprise data catalog up to date with proper metadata. This takes a lot of effort and time from the data management and governance teams. Only subject matter experts can curate this data. I would like to present how we have used Generative AI to create the column and table descriptions that were missing from the data catalog. I will cover the following topics in my talk:

  • What is a data catalog, and what are the current challenges of managing metadata?
  • Various LLMs and Generative AI techniques used to arrive at the optimal solution
  • Challenges in evaluating the model output systematically and coming up with validation metrics
  • How to win the confidence of the end users
  • Lessons learnt

This is a use case we are actively pursuing at our company, and I would like to share the learnings and the challenges we have faced.
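
To give a flavour of the topics above, here is a minimal sketch of two of them: building a prompt from table metadata, and scoring a generated description against one written by a subject matter expert. It is an illustration only and not the implementation used at the company. The names `Column`, `build_prompt` and `token_f1` are hypothetical, the prompt wording is an assumption, and token-overlap F1 is just one simple stand-in for a validation metric. The call to an actual LLM is deliberately left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Column:
    """Hypothetical container for the metadata already known about a column."""
    name: str
    dtype: str
    samples: list[str]


def build_prompt(table: str, columns: list[Column]) -> str:
    """Assemble a plain-text prompt asking for one description per column."""
    lines = [
        f"Table: {table}",
        "Write a one-sentence business description for each column.",
        "Columns:",
    ]
    for col in columns:
        # Only a few sample values are included to keep the prompt short.
        sample_text = ", ".join(col.samples[:3])
        lines.append(f"- {col.name} ({col.dtype}); sample values: {sample_text}")
    return "\n".join(lines)


def token_f1(generated: str, reference: str) -> float:
    """Token-overlap F1 between a generated and a reference description."""
    gen = set(re.findall(r"[a-z0-9]+", generated.lower()))
    ref = set(re.findall(r"[a-z0-9]+", reference.lower()))
    common = gen & ref
    if not common:
        return 0.0
    precision = len(common) / len(gen)
    recall = len(common) / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    cols = [Column("cust_id", "int", ["1001", "1002", "1003", "1004"])]
    print(build_prompt("customers", cols))
    print(token_f1("customer identifier", "unique customer identifier"))
```

A word-overlap score like this is cheap to compute but ignores meaning, which is one reason systematic evaluation of generated descriptions is hard.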

Prerequisites:

Basic understanding of LLMs and basic knowledge of Python

Content URLs:

I can't make the content public due to the information security policies at my company. I am glad to answer any questions. I will work with my company to create the content, get the necessary approvals, and then share the presentation here.

Speaker Info:

I am a mechanical engineering graduate with 25+ years of experience in the manufacturing and financial services domains. I started my career as a design engineer in a hydraulic turbine manufacturing company. After 5 years there, I started my IT journey at Aspect Development/i2 Technology, where I worked primarily on data scrubbing, modelling, analysis and data migration projects for supply chain management. I then joined the technology services side of Fidelity, a financial services company, where I currently work as a data scientist. I have been using Python for the last 6+ years for automation, data analysis, data science, web development, etc. I am very excited about the endless opportunities that arise in day-to-day work for applying Python to solve problems and automate routine activities. I conduct regular training sessions on data analysis (numpy, pandas, scikit-learn and matplotlib) in my company.

I am a regular speaker at the PyCon India conference. I have given various talks and workshops at PyCon 2017, 2018, 2022 and 2023.

Speaker Links:

github link - https://github.com/sdonapar

linkedin profile - https://www.linkedin.com/in/sasidonaparthi

twitter handle - @sdonapar

Section: Artificial Intelligence and Machine Learning
Type: Talk
Target Audience: Intermediate