Developing Robust Data Science / Machine Learning Pipelines using scikit-learn Pipeline

pawan.singhiitm | 30 Aug, 2017

Description:

In a data-driven world, a Data Scientist should not only develop models that perform well on KPIs; those models should also integrate easily into the larger software they are going to power. In this talk, we show how Data Scientists can leverage various features of scikit-learn to build DS/ML pipelines that are: 1. Robust, helping them experiment better with various combinations of data processing and models, 2. Easily deployable without much overhead for the development team, and 3. Less prone to simple errors.
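As a rough illustration of the idea (not the material from the talk itself), the sketch below chains a preprocessing step and a model into one scikit-learn Pipeline and searches over combinations of both. The iris dataset, the step names "scale" and "model", and the parameter values are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Toy dataset, assumed here purely for illustration.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing and model live in one object, so the scaler is always
# fitted on training data only and re-applied identically at predict time.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Experiment with combinations of data processing and model settings:
# whole steps can be swapped ("scale") and step parameters tuned ("model__C").
param_grid = {
    "scale": [StandardScaler(), MinMaxScaler()],
    "model__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# The best estimator is itself a fitted Pipeline that accepts raw features.
best_pipeline = search.best_estimator_
test_accuracy = best_pipeline.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```

Because the fitted pipeline is a single object, it can be handed over to a development team as one artifact instead of as separate preprocessing code plus a model, which is one way to read the "easily deployable" and "less prone to simple errors" points above.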

Prerequisites:

basic understanding of how DS/ML works
basic understanding of how DS/ML fits into the larger software paradigm
basic understanding of scikit-learn
good understanding of classes and objects

Content URLs:

GitHub Link of Jupyter notebook which contains material for the talk

Speaker Info:

Pawan is a Data Scientist/Machine Learning practitioner with over 5 years of experience across domains ranging from Aviation and Digital Marketing to Retail. He is currently working as a Data Scientist at JDA Software, where he leads a team of 5 Data Scientists and builds models that help retailers plan better. He holds Bachelor's and Master's degrees from the Indian Institute of Technology, Madras.

Section:	Standard library
Type:	Talks
Target Audience:	Intermediate
Last Updated:	30 Aug, 2017
