Audio Fingerprinting and Shazam-ing

Yash Sherry (~yash10)




This talk explains how an audio recognition app such as Shazam works.

The talk will consist of the following sections:

  1. Understanding the basics of music signals and histograms

  2. Algorithms for hashing and music learning

  3. Storing in databases

  4. Testing with various forms of input

  5. A look at the open-source Python software available.
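As a taste of section 1, the frequency content of a signal over time can be visualized with a spectrogram computed via scipy. This is a minimal sketch, not material from the talk itself; the 440 Hz test tone and the `nperseg` window size are illustrative choices:

```python
import numpy as np
from scipy import signal

fs = 44100                           # sample rate in Hz
t = np.arange(0, 1.0, 1 / fs)        # one second of samples
tone = np.sin(2 * np.pi * 440 * t)   # a pure 440 Hz sine

# Short-time Fourier transform: rows are frequency bins, columns are time frames
freqs, times, Sxx = signal.spectrogram(tone, fs=fs, nperseg=4096)

# The strongest frequency bin should sit near 440 Hz
peak_freq = freqs[np.argmax(Sxx.mean(axis=1))]
```

With real music, `matplotlib.pyplot.pcolormesh(times, freqs, Sxx)` would render the familiar spectrogram image from which fingerprint peaks are picked.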

The primary libraries used are: numpy, matplotlib, scipy, pyaudio.

What we aim for by the end of the talk: an understanding of the basics of how audio fingerprinting works, and of how to develop Python apps that recognize and classify tones.
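To give a flavor of the fingerprinting step, here is a sketch of the landmark-hashing idea popularized by Shazam: pair each spectrogram peak with a few nearby peaks and hash the (frequency, frequency, time-delta) triple. The peak list and `fan_out` value below are made up for illustration and are not the talk's actual implementation:

```python
import hashlib

# Hypothetical spectrogram peaks as (time_index, frequency_bin) pairs,
# as would be extracted from local maxima of a spectrogram.
peaks = [(0, 40), (3, 52), (7, 40), (9, 61)]

def fingerprint(peaks, fan_out=3):
    """Pair each anchor peak with the next few peaks and hash
    (freq1, freq2, time_delta) into compact fingerprint strings."""
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            key = f"{f1}|{f2}|{t2 - t1}".encode()
            # Truncated SHA-1 keeps the fingerprints short; store
            # each with its anchor time for later offset matching.
            hashes.append((hashlib.sha1(key).hexdigest()[:10], t1))
    return hashes

prints = fingerprint(peaks)
```

Because the hashes depend only on frequency pairs and time deltas, the same snippet of a song produces the same fingerprints regardless of where in the track it was recorded, which is what makes database lookup (section 3) possible.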


No special requirements beyond a basic understanding of Python; the theory will be covered in the talk. The following libraries should be installed beforehand.

Numpy (for numeric computing)
Scipy (for numeric computing)
Matplotlib (for visualization)
Pyaudio (for audio processing)

Content URLs:

Here is a link to the videos of these projects running:

All videos will be added by 1st July. Links to the presentation and the code will be added by 2nd July.

Speaker Info:

I am a second-year student at IIIT-Delhi, majoring in Computer Science. I have a decent amount of research experience and am a core member of two Korean research labs, Irisys and Optimede. Apart from this, I have worked as a research intern at Stanford on Crowd Research under Prof. Michael Bernstein, in the domain of Data Science.

I am currently working with Carnegie Mellon University in the field of Reinforcement Learning, where we have developed an algorithm faster than the current DQN implementation by Google's DeepMind. We will be publishing this work shortly at a prestigious conference.

Speaker Links:

Gmail:
LinkedIn:
Project demos:
Open Source:

Section: Scientific Computing
Type: Talks
Target Audience: Intermediate
Last Updated: