Lessons from the frontlines in Data Science
Rajesh RS (~rajesh7)
The goalposts for data science teams shift constantly, so much so that certain good practices are not emphasized often enough. In this talk, I wish to discuss some lessons from five years of hands-on data science work, along with ideas and universal principles that are still relevant in 2019 and will hopefully remain relevant for several years to come.
This talk is not merely Python- and development-centric; it also covers good data science practices relevant to the statistically and mathematically inclined among us who practice data science in our day jobs.
While this talk will discuss good practices in data science and ML from a process standpoint (data preparation, exploration, visualization, model development, evaluation and deployment), it will also address some of the key paradigms in data science, drawing on the author's hands-on experience on the frontlines and his experience leading data science teams in solution and product development.
Topics currently under consideration by the author for inclusion in the talk:
- Cloud-centricity of data science workflows
- Data pipelines and their importance
- Continuous solution delivery
- Reproducibility and model drift
- Managing algorithmic bias
- Performance metrics for unusual models
- Physical layer deployment
Practicing data scientists or data analysts who have written Python-based data science code will be able to appreciate many of the ideas in this talk. Those interested in data science are welcome, but some prior hands-on experience with data science will definitely help.
Experienced data science and advanced analytics professional with practical experience in statistical analysis, large-scale data science and engineering, and machine learning, including deep learning. Solved interesting problems in industries such as banking, financial services, manufacturing, telecommunications and the energy sector, in areas such as computer vision and time series modeling, and helped envision award-winning AI and IoT solutions. Past experience in statistical problem solving, quality/reliability engineering, engineering design and process optimization in diverse industries.
- Website: http://rexplorations.wordpress.com
- Github: http://github.com/rexplorations
- Twitter: http://twitter.com/rexplorations