Model Optimization 101





State-of-the-art deep learning models are often large, which makes deploying them extremely challenging. Think of deploying to mobile devices, Raspberry Pis, and similar hardware: their compute capability cannot support such heavyweight models. This talk is about optimizing deep learning models to make them suitable for deployment to embedded devices. We will cover techniques like pruning and quantization and, along the way, discuss some practical tips and tricks for making them work well. These techniques not only help us develop lighter, faster, performant models but also make them more energy-efficient.

The outline of the talk is as follows:

  • What is model optimization? (2 minutes)
  • Why should we care about it? (1 minute)
  • Different areas in which a model can be optimized (2 minutes)
  • Mapping areas to optimization techniques (12 minutes)
    • Quantization
    • Pruning
  • Considerations & best practices (12 minutes)
  • Further directions and QA (1 minute)
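
As a rough illustration of the two techniques named in the outline, here is a minimal, framework-free sketch in plain Python. It is not the code from the talk (which targets Keras models); the function names `quantize_int8`, `dequantize`, and `prune_by_magnitude` are hypothetical, and the scheme shown (per-tensor affine quantization to 8-bit codes, plus global magnitude pruning) is just one common variant, assumed here for illustration.

```python
def quantize_int8(weights):
    """Map a list of floats to int8 codes using an affine (scale + zero point) scheme."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(weights), 0.0)
    hi = max(max(weights), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all weights are zero; any positive scale works
    # Choose the zero point so that `lo` lands on the smallest int8 code.
    zero_point = round(-128 - lo / scale)
    codes = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(q - zero_point) * scale for q in codes]


def prune_by_magnitude(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest absolute value."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned
```

Calling `quantize_int8` on a layer's weights yields one 8-bit code per weight in place of a 32-bit float, at the cost of a small rounding error on each value. `prune_by_magnitude(weights, 0.5)` zeroes the half of the weights with the smallest magnitude, which only pays off in practice when the result is stored or executed in a sparsity-aware way.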

Here's a reference deck I prepared on this topic:

Who is this talk for:


  • Machine learning developers who have worked on image models (in Keras).
  • Machine learning engineers looking for ways to optimize models for deployment.

Video URL:

Content URLs:

Speaker Info:

I am currently with PyImageSearch where I apply deep learning to solve real-world problems in computer vision and bring some of the solutions to edge devices. I am also responsible for providing Q&A support to PyImageSearch readers.

Previously, at DataCamp, I developed projects and practice pools. Prior to DataCamp, I worked at TCS Research and Innovation (TRDDC) on data privacy. There, I was part of TCS’s critically acclaimed GDPR solution called Crystal Ball.

Outside of work, I enjoy writing technical articles and speaking at developer meetups and conferences. My interests lie broadly in areas like machine learning interpretability and full-stack data science.

Section: Data Science, Machine Learning and AI
Type: Talks
Target Audience: Intermediate
Last Updated: