Audio Fingerprinting and Shazam-ing

Yash Sherry (~yash10)


Description:

This talk explains how an audio recognition app such as Shazam works.

The talk consists of the following sections:

  1. Understanding the basics of music signals and histograms

  2. Algorithms for hashing and music learning

  3. Storing in databases

  4. Testing with various forms of input

  5. Taking a look at the open-source Python software available
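As a rough illustration of sections 1 and 2, the sketch below computes a spectrogram, keeps its local peaks, and hashes pairs of nearby peaks. It is not the speaker's code: the function names (`demo_signal`, `spectrogram_peaks`, `hash_peaks`) and every parameter value are assumptions chosen for illustration, and peak-pair hashing is only one commonly described way to build Shazam-style fingerprints.

```python
import numpy as np
from scipy import signal
from scipy.ndimage import maximum_filter


def demo_signal(rate=8000, seconds_per_tone=0.5):
    """Synthetic test input: four pure tones played one after another."""
    t = np.arange(int(rate * seconds_per_tone)) / rate
    tones = [np.sin(2 * np.pi * f * t) for f in (440.0, 880.0, 1320.0, 1760.0)]
    return np.concatenate(tones)


def spectrogram_peaks(samples, rate, neighborhood=15, min_db=-60.0):
    """Return sorted (frame_index, freq_bin) pairs of local spectrogram maxima."""
    _, _, sxx = signal.spectrogram(
        samples, fs=rate, window="hann", nperseg=1024, noverlap=512
    )
    log_sxx = 10.0 * np.log10(sxx + 1e-12)
    log_sxx -= log_sxx.max()  # loudest bin becomes 0 dB
    # A bin is a peak if it equals the maximum of its neighborhood
    # and is not too far below the loudest bin.
    is_peak = (maximum_filter(log_sxx, size=neighborhood) == log_sxx) & (
        log_sxx > min_db
    )
    freq_idx, time_idx = np.nonzero(is_peak)
    return sorted(zip(time_idx.tolist(), freq_idx.tolist()))


def hash_peaks(peaks, fan_out=5, max_dt=50):
    """Pair each peak with a few later peaks: ((f1, f2, dt), anchor_time)."""
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            dt = t2 - t1
            if 0 <= dt <= max_dt:
                hashes.append(((f1, f2, dt), t1))
    return hashes


rate = 8000
samples = demo_signal(rate)
peaks = spectrogram_peaks(samples, rate)
hashes = hash_peaks(peaks)
print(len(peaks), "peaks,", len(hashes), "hashes")
```

Each hash stores only two frequency bins and a time difference, so it does not depend on where in the recording the snippet starts; the anchor time is kept separately for alignment during matching.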

The primary libraries used are NumPy, Matplotlib, SciPy, and PyAudio.

What we aim for by the end of the talk: understanding the basics of how audio fingerprinting works and how we can develop Python apps to recognize and classify tones.
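To make the recognition step concrete, here is a minimal sketch of storing fingerprints and matching a query against them. It assumes fingerprints are `((f1, f2, dt), time)` pairs; the `FingerprintStore` class, the track names, and the hash values are all hypothetical, and an in-memory dictionary stands in for a real database.

```python
from collections import Counter, defaultdict


class FingerprintStore:
    """Hypothetical in-memory index: hash -> list of (track_id, time)."""

    def __init__(self):
        self._index = defaultdict(list)

    def add(self, track_id, hashes):
        for h, t in hashes:
            self._index[h].append((track_id, t))

    def match(self, query_hashes):
        """Vote for (track, time offset) pairs; return the best or None."""
        votes = Counter()
        for h, t_query in query_hashes:
            for track_id, t_track in self._index.get(h, ()):
                votes[(track_id, t_track - t_query)] += 1
        if not votes:
            return None
        (track_id, offset), count = votes.most_common(1)[0]
        return track_id, offset, count


# Made-up fingerprints for two tracks.
store = FingerprintStore()
store.add("track_a", [((10, 20, 3), 0), ((20, 35, 2), 3),
                      ((35, 12, 4), 5), ((12, 40, 1), 9)])
store.add("track_b", [((11, 21, 3), 0), ((20, 35, 2), 1), ((7, 9, 2), 4)])

# A query clip that starts 3 frames into track_a.
query = [((20, 35, 2), 0), ((35, 12, 4), 2), ((12, 40, 1), 6)]
result = store.match(query)
print(result)  # ('track_a', 3, 3)
```

A true match produces many hashes that agree on the same time offset, while accidental hash collisions scatter across offsets, which is why the vote is taken over (track, offset) pairs instead of tracks alone.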

Prerequisites:

No special requirements beyond a basic understanding of Python. The theory will be covered in the talk. The following libraries need to be installed beforehand:

NumPy (for numeric computing)
SciPy (for scientific computing and signal processing)
Matplotlib (for visualization)
PyAudio (for audio recording and playback)

Content URLs:

Videos of these projects running: https://www.youtube.com/channel/UCcedwpxpmWggI_Wcfx3ijxQ

All videos will be added by 1st July. I will add links to the presentation and the code by 2nd July.

Speaker Info:

I am a second-year student at IIIT-Delhi, majoring in Computer Science. I have a decent amount of research experience and am a core member of two Korean research labs, Irisys and Optimede. I have also worked as a research intern at Stanford on Crowd Research, in the area of data science, under Prof. Michael Bernstein.

I am currently working with Carnegie Mellon University on reinforcement learning, where we have developed an algorithm that is faster than the current DQN code from Google's DeepMind. We will be publishing this work at a prestigious conference shortly.

Speaker Links:

Gmail: yash14123@iiitd.ac.in
LinkedIn: https://www.linkedin.com/in/yash-sherry-63ab8aaa
Project demos: https://www.youtube.com/channel/UCcedwpxpmWggI_Wcfx3ijxQ
Open Source: https://github.com/theaverageguy/

Section: Scientific Computing
Type: Talks
Target Audience: Intermediate
Last Updated: