Adversarially attack ML models. Now defend against them!

Niharika Shrivastava (~OrionStar25)


Description:

With AI increasingly woven into users' daily lives, trustworthy AI is of utmost priority. Even though AI advocates worldwide have started talking about ethical considerations during ML model building, in reality very few people know how to build robust, privacy-preserving, and fair AI models.

In this talk, I'll explore two concrete technical concepts of trustworthy AI: ensuring robustness and fairness in ML models.

Robustness:

1. Attendees will gain an in-depth understanding of critical vulnerabilities in common AI models and how to exploit them to adversarially attack the model (e.g., inference attacks, data poisoning).
2. This will be followed by simple defence strategies to increase robustness (e.g., gradient obfuscation, input transformations); a minimal sketch of both an attack and a defence follows this list.
3. Finally, if time permits, we'll see how adaptive attacks can be mounted against those very defence strategies, rendering them completely useless!
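To give a flavour of the robustness segment, here is a minimal sketch of one classic gradient-based attack (FGSM) paired with a simple input-transformation defence, assuming a PyTorch image classifier with inputs in [0, 1]. The function names, epsilon, and noise level are illustrative assumptions, not the talk's exact material:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """Fast Gradient Sign Method: nudge every input pixel in the
    direction that increases the model's loss the most."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # One step along the sign of the input gradient, clipped back
    # to the valid pixel range.
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

def transform_defence(x, sigma=0.05):
    """A (weak) input-transformation defence: randomise the input
    before inference, hoping to wash out the adversarial perturbation.
    Adaptive attacks can typically bypass defences like this one."""
    return (x + sigma * torch.randn_like(x)).clamp(0, 1)
```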

Fairness:

1. Attendees will learn how they can unconsciously encode bias (representational bias, model bias, etc.) while training AI models.
2. This is followed by simple strategies to correct this bias using domain knowledge; a small reweighing sketch follows this list.
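To make the bias-correction idea concrete, below is a minimal sketch of one simple pre-processing strategy, reweighing (in the style of Kamiran & Calders): each training sample is weighted so that group membership and outcome become statistically independent in the weighted data. The function name and data layout are assumptions for illustration:

```python
import numpy as np

def reweigh(groups, labels):
    """Assign each (group, label) cell the weight
    P(group) * P(label) / P(group, label), so that group membership
    and outcome look independent in the weighted training data."""
    groups, labels = np.asarray(groups), np.asarray(labels)
    weights = np.ones(len(labels), dtype=float)
    for g in np.unique(groups):
        for y in np.unique(labels):
            cell = (groups == g) & (labels == y)
            if cell.any():
                # expected frequency under independence / observed frequency
                weights[cell] = ((groups == g).mean()
                                 * (labels == y).mean() / cell.mean())
    return weights

# The weights can then be passed to most training APIs, e.g.
# an sklearn estimator's fit(X, y, sample_weight=reweigh(groups, y)).
```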

Prerequisites:

Machine Learning, knowledge of gradients and backpropagation, elementary knowledge of linear algebra

Speaker Info:

I am a postgraduate from the National University of Singapore with a Master's in Computing (Artificial Intelligence specialization). My interests in AI are diverse, ranging from Natural Language Processing to Applied Data Science and Robotics. Currently, I work as a Data Scientist at the intersection of forecasting, optimization, and advanced reward-based AI models.

Fun fact: I previously spoke at PyCon India in 2019!

Speaker Links:

Past speaking engagements: Link
Website: Link
Blog: Link

Section: Artificial Intelligence and Machine Learning
Type: Talk
Target Audience: Intermediate