Accelerating Transfer Learning using Effective Caching and How to Debug TensorFlow Programs
Lokesh Kumar T (~tlokeshkumar)
Description:
Accelerating Transfer Learning using Effective Caching Technique
Transfer Learning has become routine today. Neural networks have a lot of parameters (millions of them), which are trained iteratively in a data-driven fashion. With so many parameters comes huge representational power (the ability to model high-dimensional, complex functions). When we train a custom classifier (say, a CNN), we may not have enough data, so a network trained from scratch can easily overfit. This is where transfer learning comes in: we use previously accumulated knowledge (in the form of neural-network weights) to learn our problem.
In fine-tuning, too, we train only the final layers of the network (if you are not familiar with this, don't worry: it will be covered). Huge networks take significant time to train completely. To reduce this time, we use effective caching, informally called training with bottlenecks.
Though this method is easy to implement, it can give very good results.
ResNet50, which took 45 seconds per epoch with the normal transfer-learning procedure, now takes 8 seconds per epoch. That is almost a 6x speedup!*
*Trained on an Nvidia GeForce GTX 1050, i5-7300HQ processor (5-category flower dataset)
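The caching idea described above can be sketched in keras roughly as follows. The data, image size, and head architecture here are placeholder assumptions (the talk uses a 5-category flower dataset); weights=None keeps the sketch runnable offline, whereas real transfer learning would load weights="imagenet".

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

# Hypothetical stand-in data: 8 images, 5 classes (assumption, not the
# talk's actual dataset)
x_train = np.random.rand(8, 96, 96, 3).astype("float32")
y_train = np.random.randint(0, 5, size=(8,))

# 1. Frozen base: ResNet50 without its classification head.
#    weights=None only to keep this sketch offline; use
#    weights="imagenet" for real transfer learning.
base = ResNet50(weights=None, include_top=False,
                pooling="avg", input_shape=(96, 96, 3))
base.trainable = False

# 2. Run the expensive frozen forward pass ONCE and cache the
#    "bottleneck" features.
bottlenecks = base.predict(x_train, batch_size=8)  # shape (8, 2048)

# 3. Train only a small head on the cached features; every epoch now
#    skips the frozen convolutional layers entirely, which is where
#    the speedup comes from.
head = models.Sequential([
    layers.Input(shape=bottlenecks.shape[1:]),
    layers.Dense(256, activation="relu"),
    layers.Dense(5, activation="softmax"),
])
head.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
head.fit(bottlenecks, y_train, epochs=2, verbose=0)
```

The key design point is that the frozen base's output never changes during training, so recomputing it every epoch is pure waste; caching it trades a little memory (or disk) for a much cheaper training loop.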
Learning Outcome
Why is Computer Vision a difficult problem?
The role of Deep Learning in Computer Vision
Deep Convolutional Networks for Image recognition
Different Convolutional Architectures for Image recognition
Difficulty in Optimizing large neural nets and hints for effective training
Uses of pretrained models and the basis of transfer learning
What is Transfer Learning and why is it important?
Different methods of Transfer Learning
Accelerating the training of a neural network by caching the non-trainable model's output (hands-on implementation in keras)
Analysing the speedups and potential limitations of this procedure
How to Debug a TensorFlow Program?
This presentation is not about how to debug a DL model (for example, a model that is not fitting well). It is about how to debug your program from a programming perspective. Debugging a TensorFlow program can be difficult for many reasons, some of which are:
The concept of computational graph construction
The abstraction of tf.Session()
and many more. So we will introduce commonly used TensorFlow debugging tools; the main ones are:
TensorBoard
TensorFlow Debugger (tfdbg)
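A minimal illustration of why the tf.Session() abstraction makes debugging hard (a sketch in TF1-style graph mode, reached through the compat.v1 API under TensorFlow 2; the tensor names are arbitrary):

```python
import tensorflow as tf

# TF1-style graph mode, the setting this talk targets
tf.compat.v1.disable_eager_execution()

a = tf.constant(3.0, name="a")
b = tf.constant(4.0, name="b")
c = tf.add(a, b, name="c")

# In graph mode `c` is a symbolic tensor: printing it shows only
# metadata (name, shape, dtype), not a value -- one reason debugging
# feels opaque.
print(c)

with tf.compat.v1.Session() as sess:
    # Intermediate values only materialise when run through the session
    value = sess.run(c)
    print(value)  # 7.0

    # To step through the graph interactively, the session can be
    # wrapped in the TensorFlow Debugger (left commented out because it
    # launches an interactive CLI):
    # from tensorflow.python import debug as tf_debug
    # sess = tf_debug.LocalCLIDebugWrapperSession(sess)
```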
Prerequisites:
Basic understanding of Deep Learning, TensorFlow, and keras
Working knowledge of Python
Content URLs:
Fast Image classification using Bottlenecks
Content for Tensorflow Debugging Topic (Under development)
Speaker Info:
R S Nikhil Krishna
Nikhil is a final-year student at IIT Madras. He currently leads the Computer Vision and AI team at Detect Technologies and has headed the CVI group at CFI, IIT Madras in the past. He has worked on semi-autonomous tumour detection for automated brain surgery at the Division of Remote Handling and Robotics, BARC, and on importance sampling for accelerated gradient optimization methods applied to Deep Learning at EPFL, Switzerland. His love for Python started about 4 years back, with a multitude of computer vision projects like QR code recognition, facial expression identification, etc.
Lokesh Kumar T
Lokesh is a 3rd-year student at IIT Madras. He currently co-heads the CVI group at CFI. He uses Python for Computer Vision, Deep Learning, and Language Analysis. At Detect Technologies, he has worked on automating chimney and stack inspections using Computer Vision and on on-board vision-based processing for drones. His interest in Python began during his stay at IIT Madras, from institute courses to CVI projects like face recognition, hand gesture control of bots, etc.
Speaker Links:
R S Nikhil Krishna
Lokesh Kumar T