Voice Cloning using Deep Learning
Kurian Benoy (~kurianbenoy)
Imagine someone calling your mother in your voice and saying anything they want. How frightening would that be? Recent advances in deep learning make this possible through Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS), paired with a vocoder that works in real time.
SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices. I will talk about similar architectures for voice cloning and how they work. If I can implement this research paper by then, I will do a live demo of generating speech from given text during the poster presentation.
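To make the three-stage data flow concrete, here is a minimal sketch in plain Python. It is not the SV2TTS implementation: every function below is a hypothetical toy stand-in for a trained neural network, and the names (`encode_speaker`, `synthesize`, `vocode`, `clone_voice`) and the embedding size are assumptions made for illustration. It only shows how a fixed-size voice representation, computed from a short reference clip, could condition text-to-speech before a vocoder produces audio samples.

```python
import math
from typing import List

EMBED_DIM = 8  # hypothetical embedding size, chosen only for this sketch


def encode_speaker(waveform: List[float], dim: int = EMBED_DIM) -> List[float]:
    """Stage 1 stand-in: map audio of any length to a fixed-size, unit-length vector."""
    chunk = max(1, len(waveform) // dim)
    feats = []
    for i in range(dim):
        seg = waveform[i * chunk:(i + 1) * chunk] or [0.0]
        feats.append(sum(abs(x) for x in seg) / len(seg))  # mean magnitude per chunk
    norm = math.sqrt(sum(f * f for f in feats)) or 1.0
    return [f / norm for f in feats]


def synthesize(text: str, embedding: List[float]) -> List[List[float]]:
    """Stage 2 stand-in: one toy 'spectrogram frame' per character,
    shifted by the speaker embedding so the output depends on the voice."""
    frames = []
    for ch in text:
        base = (ord(ch) % 32) / 32.0
        frames.append([base + e for e in embedding])
    return frames


def vocode(frames: List[List[float]], samples_per_frame: int = 4) -> List[float]:
    """Stage 3 stand-in: turn each frame into a few waveform samples."""
    wave = []
    for frame in frames:
        level = sum(frame) / len(frame)
        wave.extend(
            level * math.sin(2 * math.pi * k / samples_per_frame)
            for k in range(samples_per_frame)
        )
    return wave


def clone_voice(reference_audio: List[float], text: str) -> List[float]:
    """Chain the three stages: reference audio -> embedding -> frames -> waveform."""
    embedding = encode_speaker(reference_audio)
    frames = synthesize(text, embedding)
    return vocode(frames)
```

In a real system each stand-in would be a separately trained model; the point here is only that the speaker embedding is computed once from the reference clip and then reused for any text.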
absolutely nothing :P
Kurian Benoy is an open source contributor to CloudCV and DVC. He is the lead organiser of School of AI, Kochi, and an AI enthusiast working on deep learning and computer vision. Kurian is a FOSSASIA Open TechNights winner and gave a talk at the FOSSASIA Open Tech Summit about the [keralarescue.in team](https://www.youtube.com/watch?v=2RzImb5JwMA).