Mozilla's DeepSpeech and Common Voice projects

Shagufta Gurmukhdas (~ShaguftaMethwani)




The human voice is becoming an increasingly important way of interacting with devices, but current state-of-the-art solutions are proprietary and strive for user lock-in. Mozilla's DeepSpeech and Common Voice projects aim to change this. In contrast to classic speech-to-text (STT) approaches, DeepSpeech is a modern end-to-end deep learning solution. Based on Baidu's Deep Speech research paper, it uses machine learning to train a model that translates raw audio data directly into text, without any domain-specific code in between. Training systems like DeepSpeech requires an extremely large amount of voice data, and most of the data used by large companies isn't available to the majority of people. That's why Mozilla launched Common Voice, a project to help make voice recognition open to everyone.

Speaker Info:

I am a deep learning enthusiast and have been exploring the field for the past year. It is the first time a technology has excited me this much since I first came to know about the internet. Other than that, I am the initiator and organizer of Django Girls Pune, and a Mozilla TechSpeaker. I am also a decent artist, and love to play the piano in my free time!

Speaker Links:

Mozilla Research machine learning home page:

Speaker's LinkedIn:

Speaker's Twitter:

Id: 684
Section: Data science
Type: Talks
Target Audience: Beginner
Last Updated: