Autonomous Vehicles See More With Thermal Imaging: Multi-modal thin cross section Object Detection

LAISHA WADHWA (~laisha77)





In the era of AI, there's a renewed focus on making autonomous cars a reality. However, there are still many failure modes, such as mistaking a pedestrian for a piece of paper or ramming into a bicycle that went undetected. This problem can be addressed by using thermal images instead of RGB. This talk revolves around how I leveraged thermal images to enhance the vision of autonomous cars!


I'll be talking about Firefly, a thermal image-based object detection module for autonomous cars that I built during the Mercedes Benz Digital Hackathon this year. I'll also share my learnings and thoughts on why autonomous vehicles need thermal cameras and why the vehicle industry should no longer remain cool to thermal imaging.


  • Introduction [5 Minutes]
    • Who am I?
    • Weren't autonomous cars doing well already?
    • What's the buzz around tech4autonomous.
    • The current problem of object detection.
    • Real-World examples of fatal consequences. (Uber and Tesla).
  • Thermal dataset- FLIR dataset - Theory [10 minutes]
    • Why do Thermal Cameras fill in the Autonomous Vehicle Sensor Gap?
    • Why use it?
      • Understanding why it's better than existing LIDAR & camera-based solutions.
    • Processing thermal images.
      • What to do and what not to!
    • Generating Thermal images (image translation - CycleGAN)
  • Data Preparation [6 minutes]
    • How do we use the images?
    • Using CycleGAN to generate more data for model enhancement.
  • Building an object detection model in TensorFlow [9 minutes]
    • Pretrained RGB network on Pascal VOC - why it failed.
    • Experimenting with YOLOv3 and RCNN - practical results
    • Why YOLOv3? Learnings from failures.
  • Seeing RGB - thermal in Action [1-minute demo] [4 minutes]
  • Comparing RGB outputs vs Thermal Outputs. (Let's see some numbers)
  • Challenges faced.
  • Future scope for FireFly (Open-sourced for contribution: [])
  • Key Takeaways, Q&A

What's the buzz around autonomous cars?

  • In recent times, our reliance on automation has been increasing rapidly. This is especially evident in the automotive industry, where a growing number of software components are being used in cars and other vehicles.
  • This is done with the hope of providing better safety, comfort, and assistance to the driver, while also improving the experience of passengers. In the last few years, there has been extensive growth in Artificial Intelligence (AI) research.
  • The automotive industry has been increasing its use of AI technologies, as these have the potential to enhance driving experiences.

The problem

  • Recognition of thin cross-section objects that change direction (e.g. a cyclist or a pedestrian) is still relatively difficult. One could argue that we already have state-of-the-art object detection models and LIDAR sensors, but they don't solve the real problem.
  • Bicycles are generally considered “the most difficult detection” problem that autonomous vehicle systems face, due to their thin cross-section.
  • Unmanned vehicles cannot rely on visible images while navigating in cloudy weather/low sunlight or during the night.

The solution - Content Theory

  • We use thermal images!
  • The FLIR dataset: It helps detect and classify pedestrians, bicyclists, animals, and vehicles in challenging conditions like total darkness, fog, smoke, inclement weather, and glare.
  • Detection range: 4x farther than typical headlights.
  • Preprocessing data: Enhance edges using a Butterworth high-pass filter. Handle class imbalance across the training and validation sets using oversampling. Narrow the labels down to only three categories (People, Bicycles, and Cars).
  • The FLIR dataset isn't sufficient to train an accurate model! Use CycleGAN to generate more data.
  • Leveraging transfer learning by using pre-trained convolution weights from ResNet pre-trained on FLIR ADAS and Pascal VOC.
  • Visiting the network architecture to understand why YOLOv3 works!
  • Comparing the shortcomings of RGB-based object detection with thermal image-based detection on real-world videos.
  • Discussion on results and challenges that need to be addressed.
  • Future scope: the problem of occlusion can be handled by predicting the optical flow/motion trajectories of each agent in the video.
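
The two preprocessing steps named above (Butterworth high-pass edge enhancement and oversampling) can be sketched roughly as follows. This is a minimal illustration and not Firefly's actual code: the function names, the cutoff expressed in cycles per pixel, the `boost` parameter, the assumption of 8-bit single-channel thermal frames, and the use of one image-level label per frame (real detection data has several boxes per image) are all assumptions made for the example.

```python
import numpy as np


def butterworth_highpass(shape, cutoff=0.05, order=2):
    """Butterworth high-pass transfer function H = 1 / (1 + (cutoff / D)^(2n)).

    `cutoff` is in cycles per pixel (assumed convention for this sketch).
    """
    rows, cols = shape
    u = np.fft.fftfreq(rows)[:, None]   # vertical frequencies
    v = np.fft.fftfreq(cols)[None, :]   # horizontal frequencies
    d = np.sqrt(u ** 2 + v ** 2)        # distance from the zero frequency
    d2n = d ** (2 * order)
    # Algebraically equal to the formula above, but safe at D = 0 (H = 0 there).
    return d2n / (d2n + cutoff ** (2 * order))


def enhance_edges(image, cutoff=0.05, order=2, boost=1.0):
    """Add the high-pass filtered image back onto the original frame."""
    img = np.asarray(image, dtype=np.float64)
    h = butterworth_highpass(img.shape, cutoff, order)
    high = np.real(np.fft.ifft2(np.fft.fft2(img) * h))
    return np.clip(np.rint(img + boost * high), 0, 255).astype(np.uint8)


def oversample_indices(labels, seed=0):
    """Return sample indices in which every class appears as often as the largest one.

    Minority classes are topped up by drawing their samples with replacement.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    picked = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        extra = rng.choice(idx, size=target - count, replace=True)
        picked.append(np.concatenate([idx, extra]))
    return np.concatenate(picked)
```

Calling `enhance_edges(frame)` on a 2-D `uint8` array returns a sharpened frame of the same shape, and `oversample_indices(labels)` returns indices that can be used to build a balanced training list. Oversampling is normally applied to the training split only, so that validation numbers still reflect the real class distribution.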

What's in store for you?

The talk will be very interactive and has a lot of fun scenarios to explore. Math enthusiasts will surely have a lot of fun understanding the proposed network architectures. The scenarios I'll be dealing with are very relatable, something anyone who drives would have faced. There'll be plenty of practical examples to understand, and it will be quite an interesting use case for the audience to explore.

Key Takeaways:

  1. The talk will give you a new perspective on working with thermal images and autonomous driving applications. You'll have enough information and methods to program object detection modules for autonomous cars and will also be able to research the topic further with little or no help.
  2. You'll get to explore new use cases using GANs and YOLOv3, all in TensorFlow.
  3. You'll witness live simulations of Autonomous driving images.


Prerequisites:

  • Python basics
  • Basic understanding of CNN architectures (ResNet)
  • If you are passionate about learning new technologies and methods in tech, this talk is for you.

Video URL:

Speaker Info:

I am a Data Engineer at, India. I have been working with Python for over 3 years now and I am a big-time machine learning aficionado. In the past few years I have worked with Computer Vision and Music Analysis related search. When I am not working, I build AI- and ML-based applications for social good; at work, I build applications at scale. I love participating in hackathons. I am a multiple-time hackathon winner (Microsoft AI hackathon, Sabre Hack, Amex AI hackathon, Icertis Blockchain and AIML hackathon, Mercedes Benz Digital Challenge) and people often call me "The Hackathon Girl". As a tech enthusiast, I enjoy sharing my knowledge and work with the community. I am a tech speaker (Pyconf Hyd 2019), tech blogger, podcast host, hackathon mentor at MLH hacks, Technical content creator at Omdena and Global Ambassador at Women.Tech Network. I believe in hacking my way through life one bit at a time.

Speaker Links:

Github: laishawadhwa

LinkedIn: laisha-wadhwa

Twitter: laishawadhwa

Medium: laisha.w16_85978


Section: Data Science, Machine Learning and AI
Type: Talks
Target Audience: Intermediate
Last Updated: