Defending AI models against adversarial attacks (using Python libraries Cleverhans and Foolbox)
Motivation for the talk:
- Situation #1:
Imagine a self-driving car on the road: it depends on its perception system to read images of traffic signs. If one of those images is perturbed with tiny, imperceptible noise, the car could interpret a 'stop' sign as 'go', or 'go straight' as 'go left'. Imagine the chaos and fatalities that could lead to!
- Situation #2:
Say you are building a deep learning classifier to predict whether a tumor is benign or malignant. If an attacker very slightly perturbs an image that the classifier labels 'malignant', the classifier (even a SOTA neural network) may predict it as 'benign'. This is what is called adversarial fooling of neural networks, and it would lead to an incorrect diagnosis that could cost a life. (The same goes for a benign tumor being made to look malignant.)
These are some of the reasons why privacy and security in AI matter so much right now, and why we must steer some of the enormous buzz around AI toward this topic right away.
What the talk will cover:
In this talk, I will discuss hidden gems among Python libraries, such as Cleverhans and Foolbox, that are essential for tackling adversarial attacks. An adversarial attack adds a small, carefully crafted malicious perturbation to a neural network's input to make it misclassify. I will show how these attacks (a) compromise confidential and private data and (b) fool neural networks into making wrong predictions. I will also demonstrate, with code, possible defenses against these attacks using Cleverhans and Foolbox.
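To make the core idea concrete before the demo: the canonical attack, FGSM, nudges the input in the direction of the sign of the loss gradient. Below is a minimal sketch on a toy logistic-regression "network" in pure NumPy. The model, weights, and epsilon are illustrative assumptions for this proposal, not Cleverhans or Foolbox code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(x, y, w, b):
    """Binary cross-entropy loss of a logistic-regression model on input x."""
    p = sigmoid(w @ x + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def fgsm(x, y, w, b, eps):
    """Fast Gradient Sign Method: x_adv = x + eps * sign(grad_x loss(x, y))."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w              # analytic gradient of the loss w.r.t. the input
    return x + eps * np.sign(grad_x)

# Toy classifier and a clean input (illustrative values, not from the talk)
rng = np.random.default_rng(0)
w, b = rng.normal(size=8), 0.0
x = rng.normal(size=8)
y = float(sigmoid(w @ x + b) > 0.5)   # label the point with the model's own prediction

x_adv = fgsm(x, y, w, b, eps=0.1)
# On a linear model, this one step strictly increases the loss, pushing
# the prediction toward the wrong class while |x_adv - x| stays <= eps.
```

The same one-liner idea underlies the library implementations; the demo will show the equivalent calls in Cleverhans and Foolbox on real image models.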
Outline of the talk:
- Self-introduction. Why am I here? Setting expectations for the talk. [1.5 min]
- Introduction to adversarial examples: Visual presentation [4-5 min]
- Brief overview of adversarial attacks, e.g. FGSM, BIM, targeted & untargeted [6-7 min]
- Real-life use cases where adversarial examples become adversarial threats: malware, disease diagnosis (FATAL), autonomous cars (ALSO FATAL) [2-3 min]
- What is Cleverhans? Success stories (Research) [1 min]
- Why is Cleverhans cool? [2 min]
- Brief overview of other libraries as well, esp. Foolbox [5 min]
- Code demo: adversarial attacks in medical imaging. How to reproduce the results, how to improve them, scope of the work, how to contribute (anyone is welcome!) [5-7 min]
- How to become a Cleverhans contributor? (or the other libs mentioned) [1.5 min]
- Q&A. [5 min]
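The outline above also mentions BIM (the Basic Iterative Method), which is just FGSM applied in repeated small steps, each followed by a projection back into the epsilon-ball around the clean input. A minimal sketch on the same kind of toy logistic-regression model (all model parameters and step sizes here are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(x, y, w, b):
    """Binary cross-entropy of a logistic-regression model on input x."""
    p = sigmoid(w @ x + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def bim(x, y, w, b, eps, alpha, steps):
    """Basic Iterative Method: repeated FGSM steps of size alpha,
    clipped back into the L-infinity ball of radius eps around x."""
    x_adv = x.copy()
    for _ in range(steps):
        p = sigmoid(w @ x_adv + b)
        grad_x = (p - y) * w                      # gradient of the loss w.r.t. the input
        x_adv = x_adv + alpha * np.sign(grad_x)   # small FGSM step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # projection: stay within the eps-ball
    return x_adv

# Toy classifier and clean input (illustrative values, not from the talk)
rng = np.random.default_rng(1)
w, b = rng.normal(size=8), 0.0
x = rng.normal(size=8)
y = float(sigmoid(w @ x + b) > 0.5)

x_adv = bim(x, y, w, b, eps=0.1, alpha=0.02, steps=10)
# The iterates raise the loss step by step, but the total perturbation
# never exceeds eps in the L-infinity norm thanks to the clipping.
```

The projection step is what distinguishes BIM from simply running FGSM with a larger epsilon: the attack gets many gradient queries but the final perturbation budget is unchanged.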
- Usual prerequisites: basic linear algebra, probability and statistics, plus the basics of neural networks
- Otherwise, any ML/DL enthusiast can attend. Just walk in with an open mind :) I'll make sure the talk is beginner-friendly, though it has been marked as intermediate owing to the scope and depth of the topic.
- Slide deck
- Video pitch
- If some of you are well-versed with reading papers, the following links to these papers will be quite useful:
- Towards Deep Learning Models Resistant to Adversarial Attacks
- Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models
Deya is a final-year computer science major and an AI researcher. She is a self-professed Python stan and works on applying AI to problems in social science, healthcare and security. She has done several research internships and projects at ISI Kolkata, IIIT-Bangalore, IIIT-Hyderabad, etc. She is also an avid technical writer, having written multiple research papers & editorials, and has presented papers at international conferences. Her ML/DL projects range from computational sustainability and medical imaging to customer satisfaction prediction and handwriting recognition. Deya is also an enthusiastic Kaggler and has achieved top-25% finishes in Kaggle competitions. A UN volunteer, she is passionately committed to the cause of inclusion and diversity in tech and tries in her own small way to help towards this mission.