Natural Text to Speech for Indic languages

Muru Selvakumar (~muru_selvakumar)




Natural text-to-speech (TTS) is one of the stepping stones in making a language ready for the next revolution that awaits us. A conversational user interface requires the computer not only to understand speech but also to generate it.

Existing methods construct speech waveforms from hand-tuned parameters, which is a complicated process, and the results often sound robotic and unnatural. With deep learning, we can train models that synthesize more natural-sounding speech.

Models like WaveNet, Tacotron and DeepSpeech already produce beautiful voices from English text. We strive to achieve the same for Indic languages.


We will try our best to make the content accessible to everyone, but the following experience would be very helpful.


  • Basic Python programming
  • Familiarity with deep learning concepts and the PyTorch framework


  • Familiarity with audio feature representations such as MFCC
  • Experience with Speech models

Speaker Info:

Selva Kumar:

He enjoys a bit of painting and 3D modeling, and is passionate about linguistics, languages and culture in general. He is interested in AGI and is a proud free software evangelist.

He has four publications.

  • An Attentive Sequence Model for Adverse Drug Event Extraction from Biomedical Text
  • Compositional Attention Networks for Interpretability in Natural Language Question Answering
  • MAGNET: Multi-Label Text Classification using Attention-based Graph Neural Network
  • Detecting Parking Spaces in a Parcel using Satellite Images

Speaker Links:

Selva Kumar:


Section: Data Science, Machine Learning and AI
Type: Talks
Target Audience: Intermediate
Last Updated: