Pyro Demystified: Bayesian Deep Learning

Introduction

The ability to estimate uncertainty and incorporate it into decision making is crucial to sensitive applications like Self-driving Cars, (Semi-) Autonomous Surgery, Personalized Medicine, etc., where the consequences of a wrong decision are potentially catastrophic. A Probabilistic Program is the natural way to model such processes.

Pyro is a probabilistic programming language built on top of PyTorch. Pyro is built to support Bayesian Deep Learning, which combines the expressive power of Deep Neural Networks with the mathematically sound framework of Bayesian Modeling.

Objective

The objective of this session is to introduce the audience to Bayesian Modeling in Pyro, which provides a powerful set of abstractions for Probabilistic Modeling and Inference.

• A 4-step Bayesian Modeling process is introduced by considering Bayesian Linear Regression as an example.
• The performance of the model is evaluated by visualizing the learned parameters with uncertainty estimates.
• This process is applied to different models in different settings, each of which explores a unique aspect of Bayesian Modeling.

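Models under consideration: Bayesian Linear Regression, Bayesian Neural Network (BNN), Variational Autoencoder (VAE), and Semi-Supervised VAE (SS-VAE).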

Detailed Outline

• Bayesian Modeling in Pyro (60 mins)
    • Model Definition
    • Guide Creation
    • Inference
    • Evaluation
• Generative Models
    • Supervised Learning (25 mins)
    • Unsupervised Learning (25 mins)
    • Semi-supervised Learning (25 mins)

Bayesian Modeling in Pyro

Pyro supports Probabilistic Modeling and Inference through two effectful primitives, sample and param, and a library of Effect Handlers, poutine. Bayesian Modeling in Pyro typically proceeds in the following four steps:

1. Model Definition
2. Guide Creation
3. Inference
4. Evaluation

Model Definition

A model is a stochastic function composed of deterministic statements combined with randomness. The primary source of randomness in Pyro is distributions, a module whose Distribution objects each represent a probability distribution. The function sample is an effectful statement that samples from a Distribution object. It is effectful because, in addition to its primary purpose of returning a sample from a given distribution, it has side-effects: it enables the Effect Handlers to keep track of "sample sites" in the model and to change the model behaviour at runtime as necessary.

Guide Creation

The goal of inference is to estimate the posterior distribution over the latent variables. Variational Inference, as implemented in SVI, is the workhorse of inference in Pyro.

In Variational Inference, a family of distributions Q (with "nice" properties) is considered as a Variational Approximation to the true posterior. The Variational Distribution is optimized to minimise the KL-divergence to the exact posterior over the unknowns.
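The KL-divergence to the exact posterior cannot be computed directly, because the posterior itself is unknown. The standard identity below sketches how this is usually sidestepped. The notation is assumed here and does not come from the session material: x denotes the observed data, z the latent variables, and q_φ a member of Q with variational parameters φ.

```latex
\log p(x)
  = \underbrace{\mathbb{E}_{q_\phi(z)}\!\left[\log p(x, z) - \log q_\phi(z)\right]}_{\mathrm{ELBO}(\phi)}
  + \mathrm{KL}\!\left(q_\phi(z) \,\|\, p(z \mid x)\right)
```

Since log p(x) does not depend on φ, raising the ELBO lowers the KL term by the same amount, and the ELBO only involves the joint density p(x, z) and q_φ, both of which can be evaluated.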

This Variational Distribution is encoded as a stochastic function (guide) using Pyro's sample and param statements. The inference algorithm identifies and aligns the "sample sites" in the model and the guide, and keeps track of variational parameters as defined using the param statement.

Inference

Stochastic Variational Inference (SVI) is Pyro's general-purpose inference algorithm. SVI iteratively takes gradient steps to minimize the negative ELBO (Evidence Lower Bound), which is equivalent to minimizing the KL-divergence between our approximate Variational Distribution (guide) and the true posterior over the latent variables.

Evaluation

The posterior predictive distribution over the outcome variable is estimated and plotted. A good model should account for most of the data points. If it fails to do so, the model specification must be revised to account for the unexplained data points.

Generative Models

Supervised Learning — Uncertainty Estimation

So far, the simplest regression setting, Bayesian Linear Regression with a toy dataset, has been considered, to understand Bayesian Modeling and the mechanics of Pyro. In this section, a Bayesian Neural Network (BNN) is trained on the MNIST dataset. The model's performance on the MNIST test set and Fashion MNIST is explored.

Unsupervised Learning — Expressive Power

The Variational Autoencoder (VAE) is the simplest setting for Deep Probabilistic Modeling. In this section, a neural-network-based VAE is implemented in Pyro. The model is trained on the Fashion MNIST dataset. New images are sampled from the decoder module as a demonstration.

Semi-supervised Learning — Handling Missing Data

In a semi-supervised setting, some of the data points are labelled and some are not. In a generative model, the missing data can be accounted for quite naturally. A Semi-Supervised VAE (SS-VAE) is trained on MNIST with some of the labels randomly removed. New images conditioned on these labels are generated to show the model's performance.

Intended Audience

Machine Learners interested in exploring Bayesian Deep Learning.

Desired Outcomes

• Understand the need for Bayesian Deep Learning
• Learn to use Pyro's Modeling and Inference toolbox
• Learn Uncertainty Estimation
• Learn to build Deep Probabilistic models in Pyro

Prerequisites:

1. A decent laptop (> 4 GB RAM)
2. Python 3.x installed
3. Latest versions of Pyro, PyTorch, matplotlib, pandas
4. Basic knowledge of PyTorch and Bayes' Rule

Speaker Info:

Suriyadeepan Ramamoorthy, Research Engineer at Saama Technologies.