Audio Fingerprinting and Shazam-ing

Yash Sherry (~yash10) | 30 Jun, 2016

Description:

This talk explains how an audio classification app such as Shazam works.

The talk will consist of the following sections:

Understanding the basics of music signals and histograms
Algorithms for hashing and music learning
Storing in databases
Testing with various forms of input
Taking a look at the open source software available in Python
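
As a rough illustration of the first two sections, here is a minimal sketch of one way to turn audio into fingerprints: take a spectrogram, keep the strongest frequency bin in each time frame, and hash pairs of nearby peaks. This is not the talk's actual code. The function names, the `fan_out` parameter, the one-peak-per-frame rule and the synthetic test signal are all assumptions made for the example; real systems pick peaks more carefully.

```python
import hashlib

import numpy as np
from scipy import signal


def spectrogram_peaks(samples, rate, nperseg=1024):
    """Return (frame_index, frequency_bin) for the strongest bin in each frame."""
    freqs, times, power = signal.spectrogram(samples, fs=rate, nperseg=nperseg)
    peak_bins = np.argmax(power, axis=0)  # one peak per time frame (a simplification)
    return list(enumerate(peak_bins.tolist()))


def fingerprint(samples, rate, fan_out=3):
    """Hash pairs of nearby peaks into (hash, anchor_frame) tuples."""
    peaks = spectrogram_peaks(samples, rate)
    prints = []
    for i, (t1, f1) in enumerate(peaks):
        # Pair each peak with the next `fan_out` peaks (hypothetical parameter).
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            key = f"{f1}|{f2}|{t2 - t1}".encode()
            prints.append((hashlib.sha1(key).hexdigest()[:16], t1))
    return prints


# Synthetic "song" for illustration: one second each of 440, 880 and 1320 Hz.
rate = 8000
t = np.arange(rate) / rate
song = np.concatenate([np.sin(2 * np.pi * f * t) for f in (440, 880, 1320)])
song_prints = fingerprint(song, rate)
```

Because each hash depends only on two peak frequencies and the time gap between them, a short clip taken from the middle of the signal can produce some of the same hashes as the full recording.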

The primary libraries being used are: numpy, matplotlib, scipy, pyaudio

What we aim for by the end of the talk: an understanding of the basics of how audio fingerprinting works and how we can develop Python apps to recognize and classify tones.
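
To make the recognition step concrete, here is a hedged sketch of a common approach in fingerprinting systems of this kind: store each hash with the time at which it occurs in a song, then let every matching hash from a query clip vote for a (song, time offset) pair. The in-memory dictionary stands in for a real database, and the names `database`, `add_song` and `best_match`, along with the toy hashes, are invented for illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory "database": hash -> list of (song_id, time_in_song).
database = defaultdict(list)


def add_song(song_id, prints):
    """Index a song's (hash, time) fingerprints."""
    for h, t in prints:
        database[h].append((song_id, t))


def best_match(query_prints):
    """Vote for (song_id, offset); a true match piles its votes on one offset."""
    votes = Counter()
    for h, t_query in query_prints:
        for song_id, t_song in database.get(h, []):
            votes[(song_id, t_song - t_query)] += 1
    if not votes:
        return None
    (song_id, offset), count = votes.most_common(1)[0]
    return song_id, offset, count


# Toy fingerprints, invented for illustration only.
add_song("song_a", [("h1", 0), ("h2", 1), ("h3", 2), ("h4", 3)])
add_song("song_b", [("h9", 0), ("h2", 5), ("h8", 6)])

# A clip that starts two frames into song_a.
result = best_match([("h3", 0), ("h4", 1)])
print(result)
```

Counting votes per offset, rather than per song alone, is what lets a short clip match: hashes that agree by chance scatter across many offsets, while hashes from the right song line up on a single one.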

Prerequisites:

There are no strict requirements beyond a basic understanding of Python; the theory will be covered in the talk. The following libraries need to be installed beforehand:

Numpy (for numeric computing)
Scipy (for numeric computing)
Matplotlib (for visualization)
Pyaudio (for audio processing)

Content URLs:

Here is a link to videos of these projects running: https://www.youtube.com/channel/UCcedwpxpmWggI_Wcfx3ijxQ

All videos will be added by 1st July. I will also add links to the presentation and the code by 2nd July.

Speaker Info:

I am a second-year student at IIIT-Delhi, majoring in Computer Science. I have a decent amount of research experience and am a core member of two Korean research labs, Irisys and Optimede. Apart from this, I have worked as a research intern at Stanford on Crowd Research under Prof. Michael Bernstein, in the domain of Data Science.

I am currently working with Carnegie Mellon University in the field of Reinforcement Learning, where we have developed an algorithm faster than the current DQN code by Google's DeepMind. We will be publishing our idea shortly at a prestigious conference.

Speaker Links:

Gmail: yash14123@iiitd.ac.in
LinkedIn: https://www.linkedin.com/in/yash-sherry-63ab8aaa
Project demos: https://www.youtube.com/channel/UCcedwpxpmWggI_Wcfx3ijxQ
Open Source: https://github.com/theaverageguy/

Section:	Scientific Computing
Type:	Talks
Target Audience:	Intermediate
Last Updated:	01 Jul, 2016
