Safeguarding Privacy with NLP: Leveraging Topic Modeling for Ethical AI

Abhiram Ravikumar (~abhiram89)




In conversational AI, ensuring ethical interactions and safeguarding user privacy are crucial. This presentation proposes using topic-mining techniques to raise the ethical standards of AI-driven chatbots. Large datasets such as AmbigQA and CoQA are analysed to identify and cluster the key topics linked to unethical or inaccurate responses. These topic clusters are then used to implement corrective measures such as moderation and ethical reframing, ultimately making AI communications more trustworthy.

The aim is to show how topic mining can be a vital tool for upholding high ethical standards in AI applications, protecting both user privacy and the accuracy of responses.

Basic outline of the talk

  1. Intro to AI Ethics and Privacy Concerns [4-5 minutes]
  2. How Topic Modeling Works [8-10 minutes]
     2.1. BERT based models
     2.2. LLM based models
     2.3. Analysing topics of interest in AmbigQA and CoQA
  3. Clustering unethical or inaccurate responses [4-5 minutes]
  4. Demo of the topic modeling process [2-3 minutes]
  5. Implementing Corrective Measures through Topic Clusters [5 minutes]
  6. Getting started with AI Ethics through the Open Ethics Initiative [2 minutes]
  7. Q & A [2-3 minutes]
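
To make steps 2 and 3 of the outline more concrete, here is a minimal sketch of how keywords could be extracted for each cluster of responses. It assumes the responses have already been grouped into clusters (for example by embedding them and running a clustering algorithm), and it scores terms with a simplified class-based TF-IDF weighting modelled on the one described in the BERTopic documentation. The function names `tokenize`, `class_tfidf` and `top_terms`, and the sample texts, are illustrative assumptions; they are not taken from the talk, AmbigQA, or CoQA.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens; a stand-in for a real tokenizer."""
    return re.findall(r"[a-z]+", text.lower())


def class_tfidf(clusters):
    """Score terms per cluster with a simplified class-based TF-IDF.

    `clusters` maps a cluster id to the list of texts assigned to it.
    Returns {cluster_id: {term: score}}.
    """
    if not clusters:
        return {}
    # Treat every cluster as one long document and count its terms.
    counts = {
        cid: Counter(tok for text in texts for tok in tokenize(text))
        for cid, texts in clusters.items()
    }
    # Frequency of each term across all clusters.
    total = Counter()
    for cluster_counts in counts.values():
        total.update(cluster_counts)
    # Average number of tokens per cluster.
    avg_tokens = sum(total.values()) / len(counts)

    scores = {}
    for cid, cluster_counts in counts.items():
        n_tokens = sum(cluster_counts.values())
        scores[cid] = {
            term: (freq / n_tokens) * math.log(1 + avg_tokens / total[term])
            for term, freq in cluster_counts.items()
        }
    return scores


def top_terms(scores, k=3):
    """Return the k highest-scoring terms per cluster (ties broken alphabetically)."""
    return {
        cid: [t for t, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for cid, s in scores.items()
    }


# Made-up example texts, for illustration only.
example_clusters = {
    "medical": ["What dose of the drug is safe?", "Is this drug dose safe for children?"],
    "travel": ["Which airport is closest to the city?", "Is the airport open at night?"],
}
print(top_terms(class_tfidf(example_clusters)))
```

A real pipeline would normally remove stop words and use a proper tokenizer before scoring; this sketch skips both to stay short.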

Who is this talk for?

  • Data and NLP professionals such as data scientists, engineers, and machine learning engineers who want to apply topic mining
  • AI Ethics enthusiasts and advocates
  • Python developers who want to explore topic modelling

Key takeaways

  • Gain a clear understanding of how topic mining can identify unethical or inaccurate responses in AI interactions
  • Learn how conversation datasets like AmbigQA and CoQA can be analysed using Python libraries
  • Discover strategies and tools to deal with unethical topic clusters by applying guardrails
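
As a rough illustration of the guardrail idea in the last takeaway, the sketch below routes a chatbot response to a corrective measure based on the topic cluster it was assigned to. The topic labels, the `POLICY` mapping, the `Decision` type, the `apply_guardrail` function, and the replacement wording are all hypothetical; the talk may use a different mechanism.

```python
from dataclasses import dataclass

# Hypothetical policy: topic-cluster labels mapped to corrective measures.
POLICY = {
    "personal_data": "moderate",   # withhold the response entirely
    "medical_advice": "reframe",   # keep the response but add a caveat
}


@dataclass
class Decision:
    action: str  # "allow", "moderate", or "reframe"
    text: str    # the text that would be shown to the user


def apply_guardrail(response, topic, policy=POLICY):
    """Pick a corrective measure for `response` based on its topic cluster."""
    action = policy.get(topic, "allow")
    if action == "moderate":
        return Decision(action, "Sorry, I can't share that information.")
    if action == "reframe":
        return Decision(action, response + " Please consult a qualified professional.")
    # Topics that are not flagged pass through unchanged.
    return Decision(action, response)


# Usage with made-up inputs.
print(apply_guardrail("Take two tablets a day.", "medical_advice"))
print(apply_guardrail("The museum opens at 9 am.", "travel"))
```

Keeping the policy as plain data means flagged clusters can be added or reassigned without touching the routing logic.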


Prerequisites

  • Basic understanding of Natural Language Processing (NLP)
  • Familiarity with Python and Pandas
  • Familiarity with Ethics and Privacy

Speaker Info:

Abhiram is a Data Science and Cloud Machine Learning Engineer in Collinson's specialist Data Science team. He holds a Master's degree in Data Science from King's College London and has an extensive background in natural language processing, quantum computing, brain-computer interfaces, and AI. A seasoned speaker and member of the Mozilla Tech Speakers program, Abhiram has presented at international tech conferences like PyCon, MozFest, and CodeMash, and his LinkedIn Learning course on Rust programming has reached over 60,000 participants.

Abhiram has a rich history in both academia and industry. He has published papers and posters at IEEE and ACM research conferences, and prior to his current role, he spent over four years as a developer and research fellow at SAP Labs in Bengaluru, where he specialized in web development, computer vision, and robotic process automation (RPA).

With practical experience in developing, testing, and deploying NLP products, including the application of the BERTopic topic modelling technique, Abhiram is well-positioned to provide deep insights into the changing landscape of topic modelling due to Large Language Models. His recent talk at the Analytics Vidhya DataHour Forum Talk series on clustering topic models, attended by over 4,200 participants, received an impressive feedback rating of 4.6/5, underscoring his ability to effectively communicate complex topics to diverse audiences.

Speaker Links:

Events and speaking engagements

Online presence

Section: Artificial Intelligence and Machine Learning
Type: Talk
Target Audience: Intermediate
Last Updated: