Audio Fingerprinting and Shazam-ing
Yash Sherry (~yash10)
This talk focuses on how an audio recognition app such as Shazam works.
The talk will consist of the following sections:
Understanding the basics of music signals and histograms
Algorithms for hashing and music learning
Storing in databases
Testing with various forms of input
Taking a look at the open-source software available in Python.
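The first step in the outline, going from a raw music signal to a time-frequency representation, can be sketched with the libraries listed below. This is a minimal illustration on a synthetic 440 Hz tone, not code from the talk; the sample rate and FFT segment length are arbitrary choices for the example.

```python
import numpy as np
from scipy.signal import spectrogram

# Illustrative sketch: fingerprinting starts from a spectrogram,
# the time-frequency grid from which peak "landmarks" are picked.
fs = 8000                      # sample rate in Hz (chosen for the example)
t = np.arange(0, 1.0, 1 / fs)  # one second of samples
tone = np.sin(2 * np.pi * 440 * t)  # a pure 440 Hz test tone

# freqs: frequency bins (Hz), times: segment times (s),
# Sxx: power in each (frequency, time) bin
freqs, times, Sxx = spectrogram(tone, fs=fs, nperseg=256)

# The loudest frequency bin lands within one bin width
# (fs / nperseg = 31.25 Hz) of the tone's 440 Hz.
peak_freq = freqs[Sxx.mean(axis=1).argmax()]
```

A real app would feed microphone samples (e.g. captured via pyaudio) into the same transform instead of a synthesized tone.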
The primary libraries being used are: numpy, matplotlib, scipy, pyaudio
By the end of the talk, attendees should understand the basics of how audio fingerprinting works and how to develop Python apps that recognize and classify tones.
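The core fingerprinting idea, hashing pairs of spectrogram peaks and matching a clip against a database of those hashes, can be sketched in a few lines. This is a toy illustration, not Shazam's actual algorithm or the talk's code: the peak lists, the `fan_out` parameter, and the `"song_a"` identifier are all made up for the example.

```python
import hashlib

# Toy landmark-hashing sketch: each fingerprint pairs an "anchor" peak
# with a nearby peak and hashes (anchor_freq, other_freq, time_delta).
def fingerprints(peaks, fan_out=3):
    """peaks: list of (time_index, freq_bin) tuples, sorted by time."""
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            key = hashlib.sha1(f"{f1}|{f2}|{t2 - t1}".encode()).hexdigest()[:16]
            yield key, t1  # hash plus the anchor's offset within the track

# "Database": hash -> list of (track_id, anchor_offset) entries
song_peaks = [(0, 10), (1, 24), (3, 10), (4, 40), (6, 24)]
db = {}
for h, offset in fingerprints(song_peaks):
    db.setdefault(h, []).append(("song_a", offset))

# A query clip with the same peak geometry produces the same hashes,
# so recognition reduces to dictionary lookups.
clip_peaks = [(0, 10), (1, 24), (3, 10)]
hits = [db[h] for h, _ in fingerprints(clip_peaks) if h in db]
```

Because the hash encodes only the time *difference* between peaks, the match is robust to where in the song the clip starts; a full matcher would then histogram the offset differences to confirm the alignment.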
There are no formal prerequisites beyond a basic understanding of Python; the theory will be covered in the talk. Having the following libraries set up beforehand is necessary:
NumPy (for numeric computing)
SciPy (for scientific computing)
Matplotlib (for visualization)
PyAudio (for audio processing)
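Assuming pip and Python are already available, the libraries above can typically be installed in one step (note that PyAudio may additionally require the PortAudio system library on some platforms):

```shell
# Hypothetical setup sketch; exact steps vary by OS and Python distribution.
pip install numpy scipy matplotlib pyaudio
```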
Here is a link to the videos of these projects running: https://www.youtube.com/channel/UCcedwpxpmWggI_Wcfx3ijxQ
All videos will be added by 1st July; I will add the link to the presentation and the code by 2nd July.
I am a second-year Computer Science student at IIIT-Delhi. I have a fair amount of research experience and am a core member of two Korean research labs, Irisys and Optimede. I have also worked as a research intern at Stanford on Crowd Research under Prof. Michael Bernstein, in the domain of data science.
I am currently working with Carnegie Mellon University in the field of reinforcement learning, where we have developed an algorithm faster than the current DQN code by Google's DeepMind. We will shortly be publishing this work at a prestigious conference.
Gmail: email@example.com
LinkedIn: https://www.linkedin.com/in/yash-sherry-63ab8aaa
Project demos: https://www.youtube.com/channel/UCcedwpxpmWggI_Wcfx3ijxQ
Open Source: https://github.com/theaverageguy/