# Pyro Demystified: Bayesian Deep Learning

**suriyadeepan**

**Description:**

## Introduction

The ability to estimate uncertainty and incorporate it in decision making is crucial to sensitive applications like Self-driving Cars, (Semi-) Autonomous Surgery, Personalized Medicine, etc., where the consequences of a wrong decision are potentially catastrophic. A Probabilistic Program is a natural way to model such processes.

Pyro is a probabilistic programming language built on top of PyTorch. Pyro is built to support *Bayesian Deep Learning*, which combines the expressive power of Deep Neural Networks with the mathematically sound framework of Bayesian Modeling.

## Objective

The objective of this session is to introduce the audience to Bayesian Modeling in Pyro, which provides a powerful set of abstractions for Probabilistic Modeling and Inference.

- A 4-step Bayesian Modeling process is introduced by considering Bayesian Linear Regression as an example.
- The performance of the model is evaluated by visualizing the learned parameters with uncertainty estimates.
- This process is applied to different models in different settings, each of which explores a unique aspect of Bayesian Modeling.

Models in consideration:

- *Bayesian Neural Network (BNN)*: The BNN is trained on MNIST using Pyro and the model's behavior is analysed on the MNIST test set and Fashion MNIST.
- *Variational Autoencoder (VAE)*: The VAE is used to generate images in an unsupervised setting.
- *Semi-Supervised VAE (SS-VAE)*: The SS-VAE is trained on MNIST with missing data, to understand how a Bayesian model deals with missing labels.

## Detailed Outline

- Bayesian Modeling in Pyro (60 mins)
  - Model Definition
  - Guide Creation
  - Inference
  - Evaluation
- Generative Models
  - Supervised Learning (25 mins)
  - Unsupervised Learning (25 mins)
  - Semi-supervised Learning (25 mins)

## Bayesian Modeling in Pyro

Pyro supports Probabilistic Modeling and Inference through a set of effects, `sample` and `param`, and a library of Effect Handlers, `poutine`. Probabilistic Modeling typically consists of the following steps:

- Model Definition
- Guide Creation
- Inference
- Evaluation

### Model Definition

A model is a *stochastic function* composed of *deterministic statements* combined with *randomness*. The primary source of randomness in Pyro comes from `distributions`, a module in Pyro. A `Distribution` object represents a probability distribution. The function `sample` is an *effectful* statement that samples from a `Distribution` object. It is *effectful* because it has side-effects in addition to its primary purpose of returning a sample from a given distribution, such as enabling the Effect Handlers to keep track of "sample sites" in the model and change the model's behaviour at runtime as necessary.

### Guide Creation

The idea is to infer the posterior distributions over the latent variables. Variational Inference as implemented in `SVI` is the workhorse of inference in Pyro.

In Variational Inference, a family of distributions `Q` (with "nice" properties) is considered as a *Variational Approximation* to the true posterior. The *Variational Distribution* is optimized to minimise the KL-divergence to the exact posterior over the unknowns.
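In standard notation, with $z$ the latent variables and $x$ the data, this objective reads:

$$
q^{*} = \arg\min_{q \in Q} \; \mathrm{KL}\left(q(z) \,\|\, p(z \mid x)\right)
$$

Since $\log p(x) = \mathrm{ELBO}(q) + \mathrm{KL}\left(q(z) \,\|\, p(z \mid x)\right)$ and $\log p(x)$ does not depend on $q$, minimising the KL-divergence is equivalent to maximising the Evidence Lower Bound:

$$
\mathrm{ELBO}(q) = \mathbb{E}_{q(z)}\left[\log p(x, z) - \log q(z)\right]
$$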

This *Variational Distribution* is encoded as a stochastic function (*guide*) using Pyro's `sample` and `param` statements. The inference algorithm identifies and aligns the "sample sites" in the model and the *guide*, and keeps track of variational parameters as defined using the `param` statement.

### Inference

Stochastic Variational Inference (`SVI`) is Pyro's general purpose inference algorithm. `SVI` takes gradient steps iteratively to minimise the negative ELBO objective, which is equivalent to reducing the KL-divergence between the true posterior over the latent variables and our approximate *Variational Distribution* (*guide*).

### Evaluation

The posterior predictive distribution over the outcome variable is estimated and plotted. A good model must be able to account for most of the data points. If the model fails to explain some of the data points, the model specification must be revised to account for them.

## Generative Models

### Supervised Learning — *Uncertainty Estimation*

So far, the simplest regression setting, Bayesian Linear Regression with a toy dataset, has been considered to understand Bayesian Modeling and the mechanics of Pyro. In this section, a Bayesian Neural Network (BNN) is trained on the MNIST dataset. The model's performance on the MNIST test set and Fashion MNIST is explored.

### Unsupervised Learning — *Expressive Power*

The Variational Autoencoder (VAE) is the simplest setting for Deep Probabilistic Modeling. In this section, a neural-network-based VAE is implemented in Pyro. The model is trained on the Fashion MNIST dataset. New images are sampled from the decoder module as a demonstration.

### Semi-supervised Learning — *Handling Missing Data*

In a semi-supervised setting, some of the data points are labelled and some are not. In a generative model, the missing data can be accounted for, quite naturally. A Semi-Supervised VAE (SS-VAE) is trained on MNIST with some of the labels randomly removed. New images conditioned on these labels are generated to show the model's performance.

## References

- Bayesian Linear Regression
- Bayesian Neural Network (BNN)
- Variational Autoencoder (VAE)
- Poutine: Programming with Effect Handlers in Pyro
- Stochastic Function
- A Beginner's Guide to Variational Methods
- Stochastic Variational Inference
- Mean-Field Approximation
- Variational Lower Bound
- Posterior Predictive Distribution

## Intended Audience

Machine Learners interested in exploring Bayesian Deep Learning.

## Desired Outcomes

- Understand the need for Bayesian Deep Learning
- Learn to use Pyro's Modeling and Inference toolbox
- Learn Uncertainty Estimation
- Learn to build Deep Probabilistic models in Pyro

**Prerequisites:**

- A decent laptop (> 4 GB RAM)
- Python 3.x installed
- Latest versions of Pyro, PyTorch, matplotlib, pandas
- Basic knowledge of PyTorch and Bayes' Rule

**Content URLs:**

**Speaker Info:**

Suriyadeepan Ramamoorthy, Research Engineer at Saama Technologies.