Quantization of Neural Networks for Edge Computing

Archana Iyer (~archana52)




Imagine a play in a small theatre, where you are the producer sitting with the audience. Suppose the actors are weights, and there are rows and rows of TPUs/GPUs behind the stage. The director has assured you that they have rehearsed the play about 10 times; now all you can do is pray that the performance goes well. Now imagine you have 100 different tasks to be performed backstage, but the theatre given to you is really tiny. How will you manage?

The answer is by optimizing the tasks. Divide tasks between individuals in such a way that you require less time and space.

But how do you manage that with a neural network? How does quantization affect neural computations?

When you are dealing with a large amount of data, you have to keep in mind that the values you obtain keep changing. This is especially true of noisy signal data, where a poor SNR (Signal to Noise Ratio) makes repeated measurements produce different values. A common way to deal with such data is to truncate or round the values, which is typically a many-to-few mapping. For neural networks, this mapping usually goes from 32-bit floating point (at training) to 8-bit integers (at inference).
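To make the many-to-few mapping concrete, here is a minimal sketch in plain Python. It assumes an affine (scale plus zero-point) scheme onto unsigned 8-bit values, which is one common choice and not necessarily the scheme the demo will use. The function names and the sample weights are hypothetical, chosen only for illustration.

```python
# Sketch of affine 8-bit quantization (an assumed scheme, for illustration only).
# All helper names below are hypothetical.

def quantize_params(values, num_bits=8):
    """Pick a scale and zero point that map the value range onto [0, 2**num_bits - 1]."""
    qmax = 2 ** num_bits - 1
    lo = min(min(values), 0.0)   # widen the range so 0.0 is exactly representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax or 1.0   # fall back to 1.0 if every value is zero
    zero_point = int(round(-lo / scale))
    return scale, zero_point

def quantize(values, scale, zero_point, num_bits=8):
    """Round each float to the nearest integer level, then clamp to the valid range."""
    qmax = 2 ** num_bits - 1
    return [max(0, min(qmax, int(round(v / scale)) + zero_point)) for v in values]

def dequantize(q_values, scale, zero_point):
    """Map integer levels back to approximate float values."""
    return [scale * (q - zero_point) for q in q_values]

# Hypothetical float weights, standing in for values learned at training time.
weights = [-0.8, -0.1, 0.0, 0.3, 1.2]
scale, zero_point = quantize_params(weights)
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)

# The worst-case difference between original and restored values.
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
print(q_weights, max_error)
```

Every float in the range collapses onto one of at most 256 integer levels, so each weight needs one byte instead of four. The price is a rounding error that stays within about one quantization step (`scale`).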

On the other hand, traditional Internet of Things (IoT) infrastructure has two main parts – the edge and the cloud. The edge is the part of the system that is closest to the source of data. It includes sensors, sensing infrastructure, machines, and the objects being sensed. The edge actively works to sense, store, and send that data to the cloud.

So how does quantization help with edge computing? Does it have the potential of changing how we run models on the cloud?


  • Introduction: 5 mins
  • A Beginner's Guide to Quantization: 5 mins
  • Understanding Quantization in TPUs: 10 mins
  • Demo: Implementation of Quantization in Edge Computing: 10 mins


  • Knowledge of Tensor Processing Units (TPUs)
  • Knowledge of IoT and Edge Computing
  • Knowledge of Deep Learning

Content URLs:

You can view my blogs
1. Women and Data Science
2. Quantization and need for TPUs
3. Application of signal processing in machine learning

You can view my various slides here

Speaker Info:

I am a fresher from SRM Institute of Science and Technology. I understand that engineering is not everyone's cup of tea and that everyone has a different perception of it. During my second year of study, I realized that for me education extended beyond books and into practical applications. So I collaborated with a few other mates in college and started a place called the Next Tech Lab, which was involved in cutting-edge innovation and novel research ideas.

A few of the achievements the lab helped me reach include winning first prize at the Smart India Hackathon 2017, under the Ministry of Steel, for using machine learning to detect power theft in India. Recently I was invited to the WiPDA conference in Xi'an, China, to present my work on GaN device modeling using machine learning, a collaboration with the University of Cambridge. I have around 3 IEEE Xplore papers (https://ieeexplore.ieee.org/document/8293259/) and 1 Elsevier paper for my contributions to the electrical and machine learning fields.

As a lab, we have also done a lot to promote gender diversity, keeping a 50:50 ratio even across a strength of 200 members. Our accomplishments were featured by News 18 in a short video.

Over the past 6 months, I have had the opportunity to work and intern at Saama Technologies, where I research machine learning to accelerate clinical trials. Part of this work has shown me how models need to be optimized across all devices, big or small.

Id: 962
Section: Embedded python
Type: Talks
Target Audience: Intermediate
Last Updated: