Accelerating Transfer Learning Using Effective Caching and How to Debug TensorFlow Programs

Lokesh Kumar T (~tlokeshkumar)




Accelerating Transfer Learning using Effective Caching Technique

Transfer learning has become routine today. Neural networks have a lot of parameters (millions of them), which are trained iteratively in a data-driven fashion. With so many parameters comes huge representational power (the ability to model complex, high-dimensional functions). When we train a custom classifier (say, a CNN), we often do not have much data, so a network trained from scratch can easily overfit. This is where transfer learning comes in: we reuse previously accumulated knowledge (in the form of a neural network's weights) to learn our problem.

In the case of fine-tuning, too, we train only the final layers of the network (if you are not aware of this, don't worry; it will be covered). Huge networks take significant time to train completely. To reduce this time, we use effective caching, informally called training with bottlenecks.

Though this method is easy to implement, it can give very good results.

ResNet50, which took 45 sec per epoch to train with the normal transfer learning procedure, now takes 8 sec per epoch: almost a 6x speed-up!*

*Trained on an Nvidia GeForce GTX 1050 with an i5-7300HQ processor (5-category flower dataset)
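The idea behind training with bottlenecks can be sketched independently of any framework: run the frozen part of the network over the whole dataset once, cache its outputs (the "bottlenecks"), and then train only the small trainable head on the cached features. Below is a minimal NumPy illustration of the pattern; the random-projection `frozen_extractor`, the toy data, and all sizes here are stand-ins for illustration, not the workshop code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pretrained backbone (e.g. ResNet50 without its top):
# an expensive, fixed function of the input that never changes while training.
W_frozen = rng.normal(size=(256, 64)) / np.sqrt(256)
def frozen_extractor(x):
    return np.maximum(x @ W_frozen, 0.0)      # fixed features ("bottlenecks")

X = rng.normal(size=(500, 256))               # toy dataset
y = (X.sum(axis=1) > 0).astype(float)         # toy binary labels

# Step 1: run the frozen part ONCE and cache its outputs.
bottlenecks = frozen_extractor(X)             # shape (500, 64)

# Step 2: train only a small logistic head on the cached features; the
# expensive forward pass through the backbone is never repeated per epoch.
w = np.zeros(64)
for epoch in range(50):
    p = 1.0 / (1.0 + np.exp(-(bottlenecks @ w)))      # head's predictions
    w -= 0.1 * bottlenecks.T @ (p - y) / len(y)       # gradient descent step
```

Without caching, every epoch would recompute `frozen_extractor(X)` even though its output never changes; with caching, an epoch costs only the head's update, which is where the per-epoch speed-up comes from.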

Learning Outcomes

  • Why is Computer Vision a difficult problem?

  • The role of Deep Learning in Computer Vision

  • Deep Convolutional Networks for Image recognition

    • Different Convolutional Architectures for Image recognition

    • Difficulty in Optimizing large neural nets and hints for effective training

    • Uses of pretrained models and basis of transfer learning

  • What is Transfer Learning and why is it important?

    • Different methods of Transfer Learning

  • Accelerating the training of a neural network by caching the non-trainable model's output (hands-on implementation in Keras)

  • Analysing the speedups and potential limitations in this procedure
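The hands-on portion can be sketched in a few lines of Keras. To keep the sketch self-contained it uses a tiny random dataset and a one-layer stand-in for the frozen base; in the workshop the base would be a real pretrained network such as `keras.applications.ResNet50(include_top=False)`:

```python
import numpy as np
from tensorflow import keras

# Stand-in for a frozen pretrained backbone (a real run would use e.g.
# keras.applications.ResNet50(include_top=False, pooling='avg')).
base = keras.Sequential([
    keras.Input(shape=(100,)),
    keras.layers.Dense(32, activation='relu'),
])
base.trainable = False                          # freeze the backbone

X = np.random.rand(256, 100).astype('float32')  # toy stand-in inputs
y = keras.utils.to_categorical(np.random.randint(0, 5, 256), 5)

# Step 1: a single forward pass through the frozen base caches the bottlenecks.
bottlenecks = base.predict(X, batch_size=32, verbose=0)

# Step 2: only the small classification head is trained, on cached features,
# so each epoch skips the expensive backbone forward pass entirely.
head = keras.Sequential([
    keras.Input(shape=(32,)),
    keras.layers.Dense(5, activation='softmax'),
])
head.compile(optimizer='adam', loss='categorical_crossentropy',
             metrics=['accuracy'])
head.fit(bottlenecks, y, epochs=3, batch_size=64, verbose=0)
```

At inference time the two parts are simply stacked again: an unseen image goes through `base` once and its features through `head`.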

Training Procedure

How to Debug a TensorFlow Program?

This presentation is not about how to debug a DL model (for example, when a DL model is not fitting well). It's about how to debug your program from a programming perspective. Debugging a TensorFlow program can be difficult for many reasons, some of which are:

  • The concept of computational graph construction

  • Abstraction of tf.Session()

and many more. So we will introduce commonly used TensorFlow debugging tools, of which the main ones are:

  • Tensorboard

  • TensorFlow Debugger (tfdbg)
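As a small taste of the TensorBoard side: writing the session's graph to a log directory lets you inspect the computational graph visually, which is often the first step in untangling graph-construction bugs. A minimal sketch using the TF1-style API through `tf.compat.v1` (the log directory path is arbitrary):

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()       # use TF1-style graphs and sessions

# A tiny computational graph: c = a * 2
a = tf.placeholder(tf.float32, name="a")
b = tf.constant(2.0, name="b")
c = tf.multiply(a, b, name="c")

with tf.Session() as sess:
    # Dump the graph so `tensorboard --logdir /tmp/tb_graph` can render it.
    writer = tf.summary.FileWriter("/tmp/tb_graph", sess.graph)
    result = sess.run(c, feed_dict={a: 3.0})   # -> 6.0
    writer.close()
```

tfdbg hooks in at the same point: wrapping the session with `tf_debug.LocalCLIDebugWrapperSession(sess)` (from `tensorflow.python.debug`) lets you step through `sess.run` calls and inspect intermediate tensors.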


Prerequisites:

  • Basic understanding of Deep Learning, TensorFlow and Keras

  • Working knowledge of Python

Speaker Info:

R S Nikhil Krishna

Nikhil is a final-year student at IIT Madras. He currently leads the Computer Vision and AI team at Detect Technologies and has headed the CVI group at CFI, IIT Madras in the past. He has worked on semi-autonomous tumour detection for automated brain surgery at the Division of Remote Handling and Robotics, BARC, and on importance sampling for accelerated gradient optimization methods applied to deep learning at EPFL, Switzerland. His love for Python started about 4 years back, with a multitude of computer vision projects like QR code recognition, facial expression identification, etc.

Lokesh Kumar T

Lokesh is a 3rd-year student at IIT Madras. He currently co-heads the CVI group at CFI. He uses Python for computer vision, deep learning, and language analysis. At DeTect Technologies, he has worked on automating chimney and stack inspections using computer vision and on on-board vision-based processing for drones. His interest in Python began during his stay at IIT Madras, from institute courses to CVI projects like face recognition, hand-gesture control of bots, etc.

Speaker Links:

R S Nikhil Krishna

Lokesh Kumar T

Section: Data science
Type: Workshops
Target Audience: Intermediate
Last Updated:


The results you have claimed: "ResNet50 which took 45 sec for an epoch to train using normal transfer learning procedure, now takes 8 sec per epoch. Which is almost 6x speed up!", Is this a comparison of your implementation with tensorflow's implementation?

And also, on what dataset did you get the above results?

Kousik Krishnan (~kousik)

The results put up here are from the code I wrote, not from TensorFlow's bottleneck implementation. By the normal transfer learning procedure I meant training without bottlenecks, with repeated forward passes.

For creating the bottlenecks I used a batch size of 32, which took around 25 sec. Training then took around 7-8 sec per epoch with a batch size of 64 (on the hardware mentioned above). This means we can reach around 90% within 2-3 epochs (I trained the final identity block of ResNet50). My approach is quite different from TensorFlow's, but the idea was taken from TensorFlow: in their implementation they train the last layer of Inception. The dataset used was the same as in TensorFlow's demo (the flowers dataset with 5 categories).

Lokesh Kumar T (~tlokeshkumar)

Wow! I would like to see this talk.

jatin raj (~jatin85)
