Getting the best out of GPUs for Deep learning using Python

Umang Sharma (~umang91)




GPUs are at the heart of any deep learning project. CPUs alone cannot keep up with these compute-heavy algorithms, so making good use of GPUs becomes important. GPUs were traditionally built for graphics rendering and gaming, but people realised that their parallel computing capabilities can also be applied to deep learning training. As we scale up to a cluster of GPUs, utilising all of them for model training becomes a challenge: core deep learning libraries such as TensorFlow and PyTorch do not use every GPU on their own, and one needs to add code to make that happen.

NVIDIA's CUDA platform, which is programmed in C/C++, is built to utilise GPUs effectively, but writing low-level C code can be a challenge for any deep learning engineer or data scientist. This is where Python comes into the picture: Horovod, a framework created by Uber and shipped as a Python package, takes care of distributing training across all the GPUs using a parallel computing algorithm and drives them to near-ideal capacity. This talk starts by establishing the need for GPUs in deep learning and moves on to explain how Python can make sure all GPUs in a cluster are utilised. We also discuss the algorithms behind this kind of parallel distributed training, show the contrast before and after using Horovod, and cover some interesting visualisations that describe the process.
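The parallel computing algorithm that Horovod is built around is ring-allreduce. To give a feel for it, here is a minimal pure-Python simulation, not Horovod's actual implementation (which runs over MPI or NCCL on real devices). The function name `ring_allreduce`, the list-of-lists representation of per-worker gradients, and the chunking scheme are all assumptions made for illustration.

```python
def ring_allreduce(worker_grads, average=True):
    """Simulate ring-allreduce over a list of per-worker gradient vectors.

    Illustrative only: every "worker" is just a Python list, and a "send"
    is a copy between lists. Returns one result vector per worker.
    """
    n = len(worker_grads)
    length = len(worker_grads[0])
    data = [list(g) for g in worker_grads]  # do not mutate the caller's lists
    # Split each vector into n contiguous chunks (some may be empty).
    bounds = [(c * length) // n for c in range(n + 1)]

    def ring_step(chunk_of, accumulate):
        # Snapshot every outgoing chunk first so all sends look simultaneous.
        outgoing = []
        for rank in range(n):
            c = chunk_of(rank)
            lo, hi = bounds[c], bounds[c + 1]
            outgoing.append((lo, data[rank][lo:hi]))
        # Each worker passes its chunk to the next worker in the ring.
        for rank, (lo, payload) in enumerate(outgoing):
            dst = data[(rank + 1) % n]
            for offset, value in enumerate(payload):
                if accumulate:
                    dst[lo + offset] += value   # scatter-reduce: add
                else:
                    dst[lo + offset] = value    # allgather: overwrite

    # Phase 1 (scatter-reduce): after n - 1 steps, worker r holds the
    # complete sum for chunk (r + 1) % n.
    for step in range(n - 1):
        ring_step(lambda rank: (rank - step) % n, accumulate=True)

    # Phase 2 (allgather): circulate the completed chunks around the ring.
    for step in range(n - 1):
        ring_step(lambda rank: (rank + 1 - step) % n, accumulate=False)

    if average:
        data = [[v / n for v in g] for g in data]
    return data


if __name__ == "__main__":
    grads = [[1.0, 2.0, 3.0, 4.0],
             [10.0, 20.0, 30.0, 40.0],
             [100.0, 200.0, 300.0, 400.0]]
    for rank, result in enumerate(ring_allreduce(grads, average=False)):
        print(rank, result)
```

Each worker only ever talks to its neighbour and sends one chunk per step, so the data sent per worker stays roughly constant as more workers join the ring. That property is what lets this style of training scale across a cluster.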

Talk Outline:

  • CPUs vs GPUs for Deep Learning.
  • Why are GPUs important?
  • The challenges in using GPUs.
  • How Python helps.
  • From Python to C++.
  • Scaling it up: using a cluster of GPUs.
  • Why vanilla Python TensorFlow doesn't work.
  • How to make TensorFlow do it?
  • Parallelizing via parameter sharing.
  • The solution: introducing Horovod.
  • How does it work?
  • What code changes are required?
  • Questions.
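As a rough preview of the "required code changes" item above, the sketch below shows the kind of additions a Keras training script typically needs to run under Horovod. It is a hedged illustration, not the talk's actual code: `scaled_learning_rate`, `train_with_horovod`, `build_model`, `dataset`, and the choice of SGD and loss are hypothetical, and the imports are kept inside the function so the file can still be loaded on a machine without TensorFlow or Horovod installed.

```python
def scaled_learning_rate(base_lr, num_workers):
    """Scale the learning rate linearly with the number of workers.

    A common heuristic when the effective batch size grows with the
    number of GPUs; treat it as a starting point, not a rule.
    """
    return base_lr * num_workers


def train_with_horovod(build_model, dataset, base_lr=0.01, epochs=1):
    """Train a Keras model with Horovod (hypothetical helper).

    build_model: callable returning an uncompiled tf.keras model (assumed).
    dataset: a tf.data.Dataset of (features, labels) batches (assumed).
    """
    import tensorflow as tf
    import horovod.tensorflow.keras as hvd

    # 1. Initialise Horovod; one process is started per GPU.
    hvd.init()

    # 2. Pin this process to a single GPU, chosen by its local rank.
    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        tf.config.set_visible_devices(gpus[hvd.local_rank()], "GPU")

    # 3. Scale the learning rate and wrap the optimizer so gradients
    #    are averaged across workers on every step.
    optimizer = tf.keras.optimizers.SGD(
        learning_rate=scaled_learning_rate(base_lr, hvd.size()))
    optimizer = hvd.DistributedOptimizer(optimizer)

    model = build_model()
    # The loss here is an assumption; use whatever the model needs.
    model.compile(optimizer=optimizer, loss="sparse_categorical_crossentropy")

    # 4. Broadcast the initial weights from rank 0 so all workers start
    #    from the same state.
    callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]

    # Only rank 0 prints progress to keep the logs readable.
    model.fit(dataset, epochs=epochs, callbacks=callbacks,
              verbose=1 if hvd.rank() == 0 else 0)
    return model
```

A script built this way would typically be launched with something like `horovodrun -np 4 python train.py`, where `train.py` is a placeholder name and `4` is the number of processes (one per GPU).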


Prerequisites:

  1. Basics of Python
  2. Basics of Deep Learning

Video URL:

Speaker Info:

Umang is a deep learning data scientist at a Big 4 consulting firm. He is also an upcoming author of a deep learning book. He has led numerous production-level deep learning and AI projects and enjoys designing end-to-end solutions with highly optimised workflows using the power of parallelization. He has worked on a number of problems in deep learning, from domain adaptation to automatic video captioning, time series forecasting, and many more. He is also an open source contributor to TensorFlow and Datalab. In his free time he enjoys playing golf and polo.

Section: Data Science, Machine Learning and AI
Type: Talks
Target Audience: Intermediate
Last Updated: