Haptic Learning: Inferring anatomical features using Deep Networks
Akshay Bahadur (~akshaybahadur21)
To provide haptic feedback, users have traditionally depended on external devices such as buttons, dials, styluses, and touch screens. The advent of machine learning, together with its integration with computer vision, enables users to provide input and feedback to a system efficiently. A machine learning model consists of an algorithm that draws meaningful correlations from data without being tightly coupled to a specific set of rules. It is crucial to explain the subtle nuances of the network as well as the use case we are trying to solve. The central question, however, is whether we can eliminate the external haptic system altogether and use something that feels natural and inherent to the user. To connect the dots, we will discuss the development of applications specifically aimed at localizing and recognizing human features, which can in turn be used to provide haptic feedback to the system. These applications range from recognizing digits and alphabets that the user can 'draw' at runtime, to a state-of-the-art facial recognition system, to predicting hand emojis, to recognizing hand doodles as in Google's 'Quick, Draw!' project. We will start by formulating and addressing a strong problem statement, followed by a thorough literature review. With those in place, we will discuss data gathering, algorithm evaluation, and future scope.
The presentation will include code excerpts for MNIST digit recognition and show how computer vision techniques can be used to 'draw' digits on the screen. Using the same technique, we will also look at the Quick, Draw! implementation, in which the neural network recognizes hand-drawn doodles. Subsequently, we will discuss Emojinator and the idea behind it, followed by a code walkthrough and the future scope of the project. Next, we will detect driver drowsiness, which has a very strong use case in the automobile industry. I will then demonstrate facial recognition and the research paper behind the idea. Lastly, I will demonstrate inferring Indian Sign Language using semantic segmentation and facial key-point tracking. Alongside each demo, I will cover the model used, how to gather and pre-process the data, how and why to use transfer learning, why the literature review is the most important phase of your project, and how contributing to the community ultimately helps you. At the end of the session, the audience will have a clearer picture of how to start building real-world projects on their own.
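To give a flavor of the drowsiness-detection segment: one common approach (sketched here as an assumption, not necessarily the exact implementation shown in the talk) is to track the six eye landmarks from a facial key-point detector such as dlib's 68-point model, compute the eye aspect ratio (EAR), and flag drowsiness when the EAR stays below a threshold for several consecutive frames. A minimal numpy sketch of the EAR computation:

```python
import numpy as np

def eye_aspect_ratio(eye):
    """eye: six (x, y) landmark points ordered around the eye,
    as in dlib's 68-point facial landmark scheme."""
    eye = np.asarray(eye, dtype=float)
    # vertical distances between upper- and lower-eyelid landmarks
    a = np.linalg.norm(eye[1] - eye[5])
    b = np.linalg.norm(eye[2] - eye[4])
    # horizontal distance between the eye corners
    c = np.linalg.norm(eye[0] - eye[3])
    return (a + b) / (2.0 * c)

# An open eye yields a noticeably higher EAR than a nearly closed one.
open_eye = [(0, 0), (1, 2), (3, 2), (4, 0), (3, -2), (1, -2)]
closed_eye = [(0, 0), (1, 0.2), (3, 0.2), (4, 0), (3, -0.2), (1, -0.2)]
print(eye_aspect_ratio(open_eye) > eye_aspect_ratio(closed_eye))  # True
```

In a live demo, this ratio would be computed per frame from the detected landmarks, with an alarm triggered after a run of low-EAR frames.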
- MNIST [10 mins]
- Quick, Draw (Google) [10 mins]
- Emojinator [10 mins]
- Drowsiness Detection [5 mins]
- Facial Recognition [5 mins]
- SignNet [5 mins]
- OpenPose [5 mins]
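To make the MNIST segment concrete, here is a small, hypothetical sketch (not the talk's actual code) of the inference step of a softmax digit classifier in numpy; in the session this would be a trained convolutional network fed with frames of digits 'drawn' on screen, so the random weights below are only a stand-in for learned parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in parameters for a one-layer softmax classifier over 10 digits;
# a real demo would load trained weights instead of random ones.
W = rng.standard_normal((28 * 28, 10)) * 0.01
b = np.zeros(10)

def predict_digit(image):
    """image: 28x28 grayscale array in [0, 1] (an MNIST-style frame)."""
    logits = image.reshape(-1) @ W + b
    # numerically stable softmax over the 10 digit classes
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    return int(probs.argmax()), probs

digit, probs = predict_digit(rng.random((28, 28)))
print(digit)  # a class index in 0..9; probs sums to 1
```

The 'drawing' trick in the demo simply accumulates the tracked fingertip positions into such a 28x28 canvas before calling the classifier.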
Target audience and outcome
This tutorial is aimed at machine learning practitioners with relevant experience in the field; a basic understanding of neural networks and image processing is highly recommended. By the end of the session, the audience will have a clearer understanding of how to build optimized vision models that can run on low-resource hardware. In a developing country like India, the crux of the problem lies in the heavy computational resources such models typically require. Through this tutorial, I want to share my insights on developing learning models frugally and efficiently.
- Basic understanding of and coding experience in Python
- Basic understanding of Machine Learning
- Basic concepts of image processing
Akshay Bahadur’s interest in computer science was sparked while he was working on a women's safety application aimed at women's welfare in India; since then, he has been tackling social issues in India through technology. He is currently working alongside Google on an Indian Sign Language recognition system (ISLAR), specifically aimed at running in low-resource environments in developing countries. His ambition is to make valuable contributions to the ML community and leave a message of perseverance and tenacity.
He is one of eight Google Developer Experts (Machine Learning) from India and one of 150 members worldwide of the Intel Software Innovator program.
- Invited by Google to present his research on ISLAR (Indian Sign Language) at Google headquarters in California.
- Received the "Most Influential Young Data Scientist of the Year 2019" award from The International Society of Data Scientists for his contributions to the field of Machine Learning.
- Tutorial accepted at the IEEE Winter Conference on Applications of Computer Vision (WACV 2020): “Minimizing CPU utilization for Deep Networks”.
- Delegate at the 2020 Harvard College Conference in Cambridge, Massachusetts, USA.
- Awarded the Top Innovator Award (2019) by Intel.
- Contributed to Google’s open-source project Quick, Draw! and NVIDIA’s open-source project Autopilot.
- Presented his work alongside the Google TensorFlow team at the TensorFlow Roadshow (Bangalore).
Presenting author details
- GDE Summit, California 2019
- Data Hack Summit 2019
- GDG DevFest Kolkata 2019
- Open Data Science Conference (ODSC), India 2019
- Indian Institute of Science, 2019
- Open Data Science Conference (ODSC), Boston 2019
- Open Data Science Conference (ODSC), India 2018
- DeepCognition Workshop
- Institute of Analytics [Part 1] [Part 2]
- Microsoft Advanced Analytics User group