Reasoning under uncertainty with Python
Ronojoy Adhikari (~ronojoy)
Uncertainty is a fundamental - and unavoidable - feature of daily life. In order to deal with uncertainty intelligently, we need to be able to represent it and reason about it. - Joseph Halpern
The problem of reasoning rationally in the face of uncertain knowledge presents itself in domains as diverse as engineering, medicine, ecology, and business. The pragmatic solution to this problem is to combine Boolean logic (which is the mathematics of certain propositions) with Bayesian probability (which is the mathematics of uncertainty). This combination provides us with a powerful semantic framework that allows us to
- represent uncertainty in the complex situations that appear in daily life
- reason about these complex situations and draw inferences
- take rational decisions based on the inferences we draw
- and automatically discover causal connections between phenomena we observe
Given this tremendous expressive power, it is not surprising that many of today's cutting-edge algorithms in machine learning, artificial intelligence, and data science are based on a combination of logic and probability. The emerging field of probabilistic programming attempts to provide a coherent computational framework for reasoning under uncertainty. In this workshop, we want to introduce these ideas to the Python programming community.
The aim of the workshop is to provide a short theoretical introduction to reasoning under uncertainty followed by a long hands-on session where the theory is fleshed out by analyzing real life situations, casting them as problems of reasoning under uncertainty, and solving the problems using Python tools.
The workshop will prepare participants to take on challenges in machine learning, artificial intelligence, and data science.
We want to use standard tools in the Python data and visualization toolchain to accomplish the aim above. Instead of using a monolithic toolkit, we want to combine the building blocks provided by the excellent Python modules mentioned below to show the participant how they can create their own cutting-edge reasoning, inference, and learning algorithms.
- Scipy stack - for all statistical operations
- Pandas - as a data container for all real-life data
- NetworkX - for visualizing the causal connections inferred from real life data
- IPython notebook - for an interactive, exploratory, and reproducible workflow
It is recommended that all participants have the Anaconda distribution pre-installed on their laptops.
The theoretical introduction will cover
- Reasoning with Boolean logic
- Uncertainty and Bayesian probability theory
- Uncertainty and complexity - Bayesian probability on graphs
- Inference on graphs
- Learning causal graphs
- Analyzing real-life problems using probabilities on graphs
- Taking decisions in the face of uncertainty
The hands-on will cover
- Boolean logic and probability theory using the Scipy stack
- Pandas data frame as a data container
- Building probabilistic graphical models from Pandas data frames
- Visualizing probabilistic graphical models using NetworkX
- Putting it together: real-life applications
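As a rough illustration of what "probability on graphs" means in the smallest possible case, the sketch below does inference by enumeration on a two-node graph using only the Python standard library. The graph (Rain -> WetGrass) and all of its numbers are hypothetical and chosen purely for illustration; the workshop itself intends to build such models from Pandas data frames and draw them with NetworkX.

```python
from fractions import Fraction

# Hypothetical two-node graph: Rain -> WetGrass. All numbers are made up.
p_rain = Fraction(1, 5)                      # prior P(rain)
p_wet_given = {True: Fraction(9, 10),        # P(wet | rain)
               False: Fraction(1, 10)}       # P(wet | no rain)

def joint(rain, wet):
    """P(rain, wet), factorised along the graph as P(rain) * P(wet | rain)."""
    prior = p_rain if rain else 1 - p_rain
    likelihood = p_wet_given[rain] if wet else 1 - p_wet_given[rain]
    return prior * likelihood

def posterior_rain_given_wet():
    """P(rain | wet) by enumeration: the joint divided by the evidence P(wet)."""
    evidence = joint(True, True) + joint(False, True)
    return joint(True, True) / evidence

print(posterior_rain_given_wet())  # prints 9/13
```

Observing wet grass raises the probability of rain from the prior 1/5 to 9/13. Larger graphs apply the same idea: multiply the local tables along the edges, then sum out what was not observed.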
The real life applications we want to present are
- Should I switch? - the Monty Hall game show problem, a surprisingly subtle instance of reasoning under uncertainty. Correct decisions could make you rich!
- How much premium should I pay? - insurance companies charge premiums based on your age, gender, and a number of other attributes. What goes on under the hood in the algorithms that decide your premium? Build your own premium model and find out how much you should pay!
- What's wrong with my printer? - Bill Gates famously said that "Microsoft's competitive advantage lies in its expertise in Bayesian networks". Analyze the Microsoft printer fault wizard and implement a variant for your own fault analysis problem.
NOTE: the examples above can be revised depending on feedback.
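To give a flavour of the first example, here is a minimal sketch that computes the Monty Hall winning probabilities exactly, by enumerating every case. It assumes the standard rules, which the description above does not spell out: the car is equally likely to be behind any of three doors, the host knows where it is and always opens a door that is neither the player's pick nor the car, and the host chooses uniformly at random when two such doors are available. The function name `win_probability` is our own, not part of any workshop material.

```python
from fractions import Fraction
from itertools import product

def win_probability(switch):
    """Exact chance of winning the car for a player who always (or never) switches."""
    doors = range(3)
    total = Fraction(0)
    # Car location and first pick are each uniform over the three doors.
    for car, pick in product(doors, repeat=2):
        # The host may open any door that is neither the pick nor the car.
        openable = [d for d in doors if d != pick and d != car]
        for opened in openable:
            # Each (car, pick) pair has weight 1/9, split evenly over host choices.
            weight = Fraction(1, 9) / len(openable)
            if switch:
                final = next(d for d in doors if d not in (pick, opened))
            else:
                final = pick
            if final == car:
                total += weight
    return total

print(win_probability(switch=True))   # prints 2/3
print(win_probability(switch=False))  # prints 1/3
```

Under these assumptions switching wins twice as often as staying. Changing the host's behaviour (for example, a host who opens a door at random) changes the answer, which is part of what makes the problem subtle.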
At the end of the workshop, participants will be able to analyze a problem in their domain to decide if it can be treated as a problem of reasoning under uncertainty - surprisingly many problems can! They will be able to formulate their problem using standard Python tools (Scipy, Pandas, NetworkX) and arrive at answers to questions of inference, learning, and decision making. They will be able to present these answers in an appealing visual form, ready for dissemination on the web.
This workshop will be conducted jointly by two speakers.
Ronojoy Adhikari is a professor of Physics at The Institute of Mathematical Sciences, Chennai. His research interests include the physics of materials, reasoning under uncertainty, and causal knowledge discovery. Ronojoy is a recipient of the Google Research Award for Machine Learning. He is a strong advocate of the use of Python in academia.
- PhD (Physics) – Indian Institute of Science (2003)
- Postdoctoral Research – University of Edinburgh (2003 – 2006)
- Google Research Award for Machine Learning (2010)
- Cambridge-Hamied Visiting Lecturer – Cambridge University (2012)
- Indo-US Science and Technology Fellow – New York University and Princeton University (2013)
Dorai Thodla is a serial entrepreneur with a couple of startups in India and the US. His current product initiatives are all based on Python/Django. He is also a Python evangelist and teaches Python in colleges.
Dorai is currently part of the ValuefromData initiative along with Ronojoy Adhikari and Future Focus Infotech. ValuefromData is focused on tools to learn from data and provide actionable insights. We plan to use Python and some of the available libraries to help people leverage the growing amount of data from both internal and external sources.
- Innovation Catalyst at Future Focus Infotech
- Innovation Mentor for Hindustan University and KCG College of Technology
- Startup mentor for KCG Technology Business Incubator
- Part of TiE Chennai Mentoring Group
- Founder, Technology Strategies, LLC, California
- Founding member of BuilldSkills.org