# Bayesian Learning using Python

Abinash Panda (~abinashpanda)


### Introduction

• The aim of this workshop is to introduce users to the Bayesian approach to statistical modeling and analysis.
• The workshop covers:
  • Markov Chain Monte Carlo (MCMC): a class of algorithms that sample from a probability distribution by constructing a Markov chain whose stationary distribution is the target distribution.
  • Probabilistic Graphical Models (PGMs): a machine learning technique that uses a network structure over random variables, together with smaller conditional probability distributions, to represent the joint distribution over all the variables. Inference and prediction are performed by conditioning the model on the observed values and computing the resulting conditional probability distribution.
• The workshop mainly focuses on:
  • using `PyMC` for MCMC
  • classification problems using Bayesian models, and how to work with them using `pgmpy`
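
The MCMC idea described above can be sketched without any library. The following is a minimal random-walk Metropolis sampler in plain Python; it is not PyMC's API. The function names, step size, sample count, burn-in length, and the standard normal target are all illustrative assumptions.

```python
import math
import random


def metropolis(log_density, start, n_samples, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    `log_density` only needs to be known up to an additive constant.
    """
    rng = random.Random(seed)
    current = start
    current_logp = log_density(current)
    samples = []
    for _ in range(n_samples):
        # Propose a move by adding Gaussian noise to the current state.
        proposal = current + rng.gauss(0.0, step_size)
        proposal_logp = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        log_ratio = proposal_logp - current_logp
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, current_logp = proposal, proposal_logp
        # The chain records the current state whether or not it moved.
        samples.append(current)
    return samples


def standard_normal_logpdf(x):
    # Unnormalised log density of N(0, 1) (hypothetical target).
    return -0.5 * x * x


samples = metropolis(standard_normal_logpdf, start=0.0, n_samples=20000)
kept = samples[2000:]  # discard an assumed burn-in period
mean = sum(kept) / len(kept)
variance = sum((x - mean) ** 2 for x in kept) / len(kept)
print("sample mean:", mean, "sample variance:", variance)
```

Successive states of the chain are correlated, so the retained samples carry less information than the same number of independent draws would.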

### Tutorial

• Introduction to the SciPy stack
  • Introduction to NumPy
  • Brief introduction to SciPy
  • Visualization using Matplotlib
• Introduction to Probability Theory
  • Basics of probability theory with a hands-on exercise
  • Bayesian vs. Frequentist
• Thinking like a Bayesian
  • Bayes' Theorem and a hands-on exercise
  • Introduction to Markov Chain Monte Carlo (MCMC)
  • MCMC using PyMC
• Probabilistic Graphical Models (PGMs)
  • Introduction to PGMs
  • PGMs using pgmpy
    • Creating models using pgmpy
    • Parameterizing the model
    • Asking questions to the model: Inference
  • Special graphical models
    • Naive Bayes Models
    • Hidden Markov Models
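
To make the "Asking questions to the model: Inference" item concrete, here is a minimal sketch of inference by enumeration on a three-variable network, written in plain Python rather than with pgmpy. The network (rain, sprinkler, wet grass) and every probability in it are hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical conditional probability distributions (CPDs).
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.1, False: 0.9}
# P(wet = True | rain, sprinkler), keyed by (rain, sprinkler).
p_wet_given = {
    (True, True): 0.99,
    (True, False): 0.8,
    (False, True): 0.9,
    (False, False): 0.01,
}


def joint(rain, sprinkler, wet):
    """Joint probability, factorised according to the network structure."""
    p_wet = p_wet_given[(rain, sprinkler)]
    p_wet_value = p_wet if wet else 1.0 - p_wet
    return p_rain[rain] * p_sprinkler[sprinkler] * p_wet_value


def posterior_rain(wet):
    """P(rain | wet): sum out the sprinkler, then normalise."""
    unnormalised = {
        rain: sum(joint(rain, sprinkler, wet) for sprinkler in (True, False))
        for rain in (True, False)
    }
    total = sum(unnormalised.values())
    return {rain: p / total for rain, p in unnormalised.items()}


posterior = posterior_rain(wet=True)
# Sanity check: the joint distribution sums to 1 over all assignments.
total_mass = sum(joint(r, s, w) for r, s, w in product((True, False), repeat=3))
print("P(rain | wet grass):", posterior[True])
```

Enumeration visits every assignment of the unobserved variables, so its cost grows exponentially with the number of variables and it is practical only for very small networks.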

### Take home Bonus

• How to compute optimal parameters for the models
• How to construct the network structure from data when no domain knowledge is available

#### Prerequisites:

• Basic knowledge of Python
• Familiarity with the SciPy stack is a bonus (optional).

#### Content URLs:

• Initial draft of the presentation: http://nbviewer.ipython.org/github/pgmpy/pgmpy_notebook/blob/master/Probabilistic%20Graphical%20Models%20using%20pgmpy.ipynb
• PyMC: https://pymc-devs.github.io/pymc/
• pgmpy: http://pgmpy.org
• NumPy: http://www.numpy.org/
• SciPy: http://docs.scipy.org/doc/scipy/reference/
• Matplotlib: http://matplotlib.org/index.html

#### Speaker Info:

Abinash Panda is an undergraduate from IIT (BHU) Varanasi and currently works as a Data Scientist. He has contributed to open-source libraries such as the Shogun Machine Learning Toolbox and pgmpy, which he started along with four other members. He spends most of his free time improving pgmpy and helping new contributors.

Ankur Ankan is a B.Tech graduate from IIT Varanasi who currently works in the field of data science. He is an open-source enthusiast, and his major work includes starting pgmpy with four other members. He is presently working on improving the performance of pgmpy and mentoring GSoC students working on pgmpy. In his free time he likes to participate in Kaggle competitions.