Dangers of Large Language Models: How to Mitigate the Risks?

Abhijeet Kumar (~abhijeet3922)


Description:

Large Language Models (LLMs) are powerful tools for natural language understanding and natural language generation tasks. However, they pose a number of risks, including:

  • Toxicity - LLMs can generate toxic and harmful content when prompted with crafted adversarial inputs.
  • Bias - LLMs can reflect and amplify biases present in the data they are trained on.
  • Privacy - LLMs can leak private information or memorize personal data, raising ethical concerns.
  • Truthfulness - LLMs can coherently generate false or misleading information.
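
As a minimal illustration of the privacy risk, the toy sketch below scans generated text for personally identifiable information (PII) before it reaches a user. The regex patterns and function names here are illustrative assumptions, not a production screen; real systems use trained detectors.

```python
import re

# Illustrative PII patterns (assumed for this sketch; not exhaustive).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def find_pii(generated_text: str) -> dict:
    """Return a dict mapping PII type -> list of matches found in the text."""
    hits = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(generated_text)
        if matches:
            hits[label] = matches
    return hits

# A model response that leaks contact details would be flagged:
leaked = "Sure! You can reach John at john.doe@example.com or 555-123-4567."
print(find_pii(leaked))
```

A screen like this would sit between the model and the user, redacting or withholding responses that contain matches.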

The motivation behind the talk is to raise awareness and critical thinking among developers who may be unaware of these issues. Additionally, the idea is to introduce harmlessness screens and debiasing/detoxification techniques adopted by companies to mitigate such risks. In this talk, I want to broadly cover the topic in two parts.

  • Highlight the risks of LLMs
  • Methods to mitigate these risks.
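
To give a flavour of the mitigation side, here is a hedged sketch of a guardrail wrapper: input and output rails around a model call. The deny-list, function names, and `model_fn` stand-in are assumptions for illustration; production systems use trained classifiers (e.g. Perspective API) rather than keyword lists.

```python
# Toy deny-list standing in for a real safety classifier.
DENY_LIST = {"make a bomb", "steal credentials"}

def guarded_generate(prompt: str, model_fn) -> str:
    """Wrap a model call with a simple input rail and output rail."""
    lowered = prompt.lower()
    # Input rail: refuse clearly unsafe requests before calling the model.
    if any(phrase in lowered for phrase in DENY_LIST):
        return "Sorry, I can't help with that request."
    response = model_fn(prompt)
    # Output rail: withhold responses that slip past the input check.
    if any(phrase in response.lower() for phrase in DENY_LIST):
        return "[response withheld by safety filter]"
    return response

# Stand-in for an actual LLM call, used only for this demo.
fake_model = lambda p: f"Echo: {p}"
print(guarded_generate("How do I make a bomb?", fake_model))
print(guarded_generate("Summarize this article.", fake_model))
```

The same two-rail structure generalizes: swap the keyword check for a toxicity or PII classifier on either side of the model call.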

A rough outline of the talk would be:

  1. Problem Context - 2 minutes
  2. Risks of Large Language Models - 5 minutes
  3. (Python) Demos & Examples of Risks - 8-10 minutes
  4. Methods to Mitigate the Risks - 8-10 minutes
  5. Key Takeaways - 2 minutes

Intended Audience:

The talk is for anyone interested in adding guardrails to LLMs. In particular, it may interest Python developers and AI researchers working with LLMs to build responsible AI systems.

Prerequisites:

The following are prerequisites for the session.

  1. Basic Knowledge of Natural Language Processing & Generative AI.
  2. Introduction to Language Models, Prompts (not mandatory).
  3. Familiarity with HuggingFace & OpenAI (ChatGPT).

Content URLs:

This GitHub repo contains preliminary notebooks for the workshop. I will continue to refine and develop them until the conference takes place. A live demo app will be set up before the conference.

References:

  1. On the Dangers of Stochastic Parrots
  2. Evaluating Neural Toxic Degeneration in Language Models
  3. Buy Tesla, Sell Ford: Assessing Implicit Stock Market Preference in Pre-trained Language Models
  4. Perspective AI
  5. AI Guardrails
  6. LLM Attacks

Speaker Info:

I am an applied data scientist and research professional with 10+ years of experience solving problems with advanced analytics, machine learning and deep learning techniques. I started my career as a scientific officer in a government research organization (Bhabha Atomic Research Centre), working across a variety of domains such as conversational speech, satellite imagery and text. For the last 4 years, I have been working as a Director, Data Scientist at Fidelity Investments on language models (NLP) and graphs.

I have used Python throughout my career, both for solving data science problems and for pursuing research. I have published several academic and applied research papers and participated in multiple conferences over the years. In the past, I have trained professionals in machine learning and been a guest lecturer at BITS Pilani (WILP program) for the Machine Learning subject (M.Tech course).

Speaker Links:

Blog: https://appliedmachinelearning.wordpress.com/

Github: https://github.com/abhijeet3922

LinkedIn: https://www.linkedin.com/in/abhijeet-kumar-1aa8b0138/

Open Source Contributions:

  1. finbert-embedding: https://pypi.org/project/finbert-embedding/
  2. classitransformers: https://pypi.org/project/classitransformers/
  3. PhraseExtraction

Section: Ethics and Philosophy
Type: Talks
Target Audience: Beginner
Last Updated: