Kedro + MLflow – an open-source integration with Hooks
Lais Carvalho (~laisbsc)
Kedro is an open-source Python development workflow framework that implements software engineering best practices for data and machine-learning pipelines; it's sometimes described as the React or Django of Data Science. MLflow is a model tracking platform open-sourced by Databricks that manages Machine Learning lifecycles, including experimentation, reproducibility, deployment and a central model registry.
Kedro and MLflow are complementary; both open-source frameworks can be used together to solve orthogonal problems. While the first provides a seamless developer experience through data abstraction and code organisation, the latter excels in model tracking and metrics visualisation.
In this talk, the speaker will introduce both frameworks, explain how the platforms interact in a project lifecycle and demonstrate how Hooks can be used to manage the integration between Kedro and MLflow, adding model and experiment tracking to your pipeline. Hooks are a mechanism that allows users to extend Kedro's main execution in an easy and consistent manner.
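As a rough illustration of the idea (not the demo code from the talk), a tracking Hook could look something like the sketch below. The `hook_impl` marker and the `before_pipeline_run`, `after_node_run` and `after_pipeline_run` hook names come from Kedro, and the tracker methods mirror `mlflow.start_run`, `mlflow.log_params`, `mlflow.log_metrics` and `mlflow.end_run`. Everything else is an assumption made for illustration: the `InMemoryTracker` stand-in, the `"run_id"` key in `run_params` (the key name depends on the Kedro version) and the rule that every numeric node output is logged as a metric.

```python
from typing import Any, Dict, Optional

try:
    # Kedro's marker for registering hook implementations.
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so this sketch also runs where Kedro is not installed.
    def hook_impl(func):
        return func


class InMemoryTracker:
    """Hypothetical stand-in exposing the subset of the mlflow API used below."""

    def __init__(self) -> None:
        self.run_name: Optional[str] = None
        self.params: Dict[str, Any] = {}
        self.metrics: Dict[str, float] = {}
        self.active = False

    def start_run(self, run_name: Optional[str] = None) -> None:
        self.run_name = run_name
        self.active = True

    def log_params(self, params: Dict[str, Any]) -> None:
        self.params.update(params)

    def log_metrics(self, metrics: Dict[str, float]) -> None:
        self.metrics.update(metrics)

    def end_run(self) -> None:
        self.active = False


class ModelTrackingHooks:
    """Opens a tracking run around a pipeline run and logs numeric outputs."""

    def __init__(self, tracker: Any) -> None:
        # Assumption: in a real project the `mlflow` module would be passed here.
        self.tracker = tracker

    @hook_impl
    def before_pipeline_run(self, run_params: Dict[str, Any]) -> None:
        # Assumption: the run identifier lives under "run_id".
        self.tracker.start_run(run_name=str(run_params.get("run_id")))
        self.tracker.log_params(run_params)

    @hook_impl
    def after_node_run(self, node: Any, outputs: Dict[str, Any]) -> None:
        # Illustrative rule: treat every numeric node output as a metric.
        metrics = {
            name: float(value)
            for name, value in outputs.items()
            if isinstance(value, (int, float)) and not isinstance(value, bool)
        }
        if metrics:
            self.tracker.log_metrics(metrics)

    @hook_impl
    def after_pipeline_run(self) -> None:
        self.tracker.end_run()


# Drive the hooks by hand, in the order Kedro would call them during a run.
tracker = InMemoryTracker()
hooks = ModelTrackingHooks(tracker)
hooks.before_pipeline_run(run_params={"run_id": "demo", "env": "local"})
hooks.after_node_run(node=None, outputs={"accuracy": 0.9, "model": object()})
hooks.after_pipeline_run()
```

Injecting the tracker keeps the Hook class testable without an MLflow server. In a Kedro project the Hook instance would be registered with the project rather than called by hand; how that is done depends on the Kedro version in use.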
Basic outline of the talk
- Introduction to Kedro (5 minutes);
- Introduction to MLflow (5 minutes);
- How MLflow complements Kedro during the project lifecycle (5 minutes);
- Overview of Hooks and the integration (5 minutes);
- Live demo (5 minutes);
- Audience Q&A (5 minutes).
During this session, we will provide an overview of Kedro and MLflow and show how both open-source frameworks can be used together to solve orthogonal problems. While the first provides a seamless developer experience through data abstraction and code organisation, the latter excels in model tracking and metrics visualisation. To conclude, the audience will learn how to leverage the recently introduced Hooks for a smoother integration with the overall project.
Data Scientists and Machine-Learning Engineers who deal with Production lifecycles. Data Science Stakeholders who are interested in knowing how to simplify their workflow.
Slides for the talk can be found here:
If you want to make the most of this talk, please have a look at the documentation beforehand:
- https://kedro.readthedocs.io/en/stable/02_getting_started/04_hello_world.html for Kedro and
- https://mlflow.org/docs/latest/quickstart.html for MLflow.
GitHub repo with the demo's code:
This talk is inspired by the latest feature addition to Kedro (Hooks) and will reference:
- https://medium.com/quantumblack/introducing-kedro-hooks-fd5bc4c03ff5 and
- https://kedro.readthedocs.io/en/stable/04_user_guide/15_hooks.html to demo the integration.
Mr Kuldeep Singh is a seasoned AI practitioner and ML architect, skilled in Data Science, DevOps implementation and Cloud Infrastructure setup, with an Agile mindset. With 12+ years of experience, he brings a unique blend of expertise that helps him integrate the business domain with both traditional and emerging IT landscapes. He has worked with the world's leading consulting and tech firms and holds an MBA from IMT along with a PGP in Big Data and Machine Learning from Great Lakes.
- LinkedIn: https://www.linkedin.com/in/iamkuldeepsingh/
- GitHub: https://github.com/iamkuldeepsingh
- Twitter: https://twitter.com/kuldeepsinghiam