DevOps for Machine Learning: Deploying ML Models at Scale

Prabakaran Kumaresshan (~prabakaran16) | 10 Jul, 2018

Description:

I swear by the Dutch, this is not an ML Workshop*

Are you one of the Cool Kids doing Style Transfer, Visual Translation or lurking at arxiv-sanity for what is hot, but wondering how to take your model beyond Jupyter notebooks?

It is my impression that the world of deep learning research is starting to plateau. What's booming: deploying DL to real-world problems.
François Chollet

I trod the same path when I started as a founding ML Engineer. Over the past two years I have learned that solid engineering is essential for building ML applications at web scale. Productionizing an ML model is the last-mile journey, and it is the most dreaded and least talked-about topic. Knowing the right toolchain to automate your build pipeline is essential for APIfying your ML models.

A typical ML pipeline is accompanied by big data infrastructure that de-normalizes and preprocesses the application data to prepare training data, followed by a microservice that exposes the trained model artifact on a runtime component as a service.

[Figure: ML pipeline]
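The stages described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the workshop's implementation: the sample tables, the popularity "model", and the function names (`denormalize`, `train`, `handle_request`, `run_pipeline`) are all assumptions made for this example, and `pickle` stands in for whatever artifact format a real pipeline would use.

```python
import json
import os
import pickle
import tempfile
from collections import Counter

# Hypothetical normalized application data: users, titles, watch events.
USERS = {1: "asha", 2: "ben"}
TITLES = {10: "Dune", 11: "Heat", 12: "Up"}
WATCH_EVENTS = [(1, 10), (1, 11), (2, 10), (2, 12), (2, 10)]


def denormalize(events, users, titles):
    """Join the normalized tables into flat training rows."""
    return [{"user": users[u], "title": titles[t]} for u, t in events]


def train(rows):
    """Train a trivial popularity model: watch counts per title."""
    return Counter(row["title"] for row in rows)


def save_artifact(model, path):
    """Persist the trained model so a separate service can load it."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)


def load_artifact(path):
    """Load the model artifact, as a serving process would at startup."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def handle_request(model, top_n):
    """What a service endpoint would return: the top titles as JSON."""
    top = [title for title, _ in model.most_common(top_n)]
    return json.dumps({"recommendations": top})


def run_pipeline():
    """Run preprocess -> train -> persist -> serve in one process."""
    rows = denormalize(WATCH_EVENTS, USERS, TITLES)
    model = train(rows)
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "model.pkl")
        save_artifact(model, path)
        served_model = load_artifact(path)
    return handle_request(served_model, top_n=2)


if __name__ == "__main__":
    print(run_pipeline())
```

The point of the sketch is the handoff: training and serving only share the serialized artifact, which is what lets each side be built, tested and deployed independently.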

In this workshop, we will explore the DevOps toolchain to build, train, test, deploy and monitor an ML model. The focus will be on the toolchain and how to automate the entire process from commit to deployment.

To illustrate the whole process, we will build a toy recommendation application for an on-demand streaming service provider, Pyflix.
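To give a feel for what a toy recommender might look like, here is a minimal sketch based on item co-occurrence ("people who watched X also watched Y"). It is an assumption for illustration only: the proposal does not specify the algorithm, the workshop lists PySpark and Scikit-Learn as its tools, and the watch history and function names below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical watch history: user -> set of titles watched.
WATCH_HISTORY = {
    "asha": {"Dune", "Heat", "Alien"},
    "ben": {"Dune", "Alien"},
    "chen": {"Heat", "Up"},
}


def build_cooccurrence(history):
    """Count how often each pair of titles was watched by the same user."""
    counts = defaultdict(lambda: defaultdict(int))
    for titles in history.values():
        for a, b in combinations(sorted(titles), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(user, history, counts, top_n=3):
    """Score unseen titles by co-occurrence with the user's watched titles."""
    seen = history[user]
    scores = defaultdict(int)
    for title in seen:
        for other, n in counts[title].items():
            if other not in seen:
                scores[other] += n
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


if __name__ == "__main__":
    counts = build_cooccurrence(WATCH_HISTORY)
    for user in WATCH_HISTORY:
        print(user, recommend(user, WATCH_HISTORY, counts))
```

A model this small keeps the attention on the surrounding toolchain, which is the stated focus of the workshop.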

Application Architecture:

Here is the reference Application Architecture for our Pyflix Recommendation Engine.

[Figure: Pyflix Recommendation Engine]

Tentative Agenda

Introduction to DevOps Culture
Quick Introduction to ML/Big Data tools used in the Application - PySpark, Scikit-Learn (if required)
Introduction to Containers and Cloud Infrastructure (Docker and AWS)
Introduction to Infrastructure as Code (Terraform and Ansible)
Building CI/CD pipeline with Jenkins
Building Data Pipeline with Airflow
Building RESTful Service with Django REST Framework
Application Architecture Introduction - Pyflix
Putting It All Together to Build Recommendation Engine

Prerequisites:

The workshop will revolve around DevOps tools for building an ML pipeline. We will implement a rudimentary recommendation engine, so a basic understanding of ML is enough. We will start with an introduction to DevOps and the tools used; however, a good understanding of DevOps culture will help participants get the most out of the workshop.

The edX course on DevOps by Microsoft is a great resource, but it is not required for this workshop.

The demo can be set up either locally with Docker or in the cloud.

Basic understanding of Containers
Basic understanding of Cloud Infrastructure (AWS)
Basic understanding of ML/Big Data (PySpark)
A little bit of googling on Jenkins and Airflow will help

Required Tools

For local demo
- A Linux PC, preferably with 8 GB of RAM. Windows or Mac users need to perform some additional steps to install Docker.
- Docker
- Docker Compose
For AWS
- awscli with configured credentials
- Terraform

Content URLs:

Will Update Shortly

Speaker Info:

By profession, Prabakaran Kumaresshan designs algorithms to score complex user interactions, classify user-generated content and derive insights, and APIfies them to run at scale. He has been data wrangling for 5+ years and specializes in NLP. He uses Jupyter to analyze data that fits in his PC's memory and PySpark for anything that doesn't, and uses Django+DRF to create microservices while embracing DevOps culture, mostly on AWS. Occasionally he gives talks at local meetups.

Speaker Links:

@iPrabakaran
Github
LinkedIn

Section:	Developer tools and Automation
Type:	Workshops
Target Audience:	Intermediate
Last Updated:	23 Jul, 2018
