Comprehensive Study of Distance Metric Learning in Nearest Neighbor Algorithm

Kousik Krishnan (~kousik)


Description:

The Nearest Neighbour (NN) algorithm, a lazy, non-parametric method used for classification, is one of the most intuitive and widely used machine learning algorithms. Business consultants often favour it for its simple, easy-to-understand framework. The performance of the algorithm can be enhanced by tuning its hyper-parameters, which include the k-value and the distance metric. However, practitioners tend to focus only on optimising k and ignore the distance metric. The very term "nearest neighbour" means that we employ some notion of nearness, i.e. we use a distance metric to quantify similarity and thus define neighbours. This underlines the importance of the distance metric in the NN algorithm. In this talk, we present some novel approaches for learning the distance metric from the training data. We also demonstrate how slight amendments to these approaches lead to a dimensionality reduction technique. The approaches are bundled together as a Python package, which will be showcased to the audience.
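
To illustrate why the distance metric matters as much as k, here is a minimal sketch of a nearest-neighbour classifier with a pluggable distance function. It is not taken from the package presented in the talk; the function names and the two-point toy dataset are assumptions made purely for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line (L2) distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, distance):
    """Majority vote among the k training points closest to `query`."""
    # Rank training indices by their distance to the query point.
    order = sorted(range(len(train_X)), key=lambda i: distance(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two labelled points and one query.
train_X = [(3.0, 3.0), (0.0, 5.0)]
train_y = ["a", "b"]
query = (0.0, 0.0)

pred_l2 = knn_predict(train_X, train_y, query, k=1, distance=euclidean)
pred_l1 = knn_predict(train_X, train_y, query, k=1, distance=manhattan)
print(pred_l2, pred_l1)  # a b
```

With the same data and the same k, the Euclidean metric picks (3, 3) as the nearest neighbour (distance about 4.24 versus 5), while the Manhattan metric picks (0, 5) (distance 5 versus 6), so the predicted label changes with the metric alone.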

Structure of the Talk:

1. An overview of the k-NN algorithm
2. Theory of distance metrics
    2.1 Mathematical definition of a metric
    2.2 Some common distance metrics
3. Deep dive into Metric Learning techniques
    3.1 Why is it important?
    3.2 The math behind metric learning 
    3.3 Application in Dimensionality Reduction
4. Implementation using some popular datasets
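
As a rough sketch of how items 3.2 and 3.3 can connect, the code below uses the Mahalanobis-style parametrisation that is common in the metric learning literature: d(x, y) = sqrt((x - y)^T M (x - y)) with M = L^T L. This formulation is an assumption, since the proposal does not state which one the talk uses, and the matrix L here is hand-picked rather than learned from data. The helper names are hypothetical.

```python
import math


def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def gram(L):
    """Return M = L^T L, which is positive semi-definite by construction."""
    n = len(L[0])
    return [[sum(row[i] * row[j] for row in L) for j in range(n)] for i in range(n)]


def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) induced by the matrix M."""
    d = [a - b for a, b in zip(x, y)]
    Md = matvec(M, d)
    return math.sqrt(sum(d_i * md_i for d_i, md_i in zip(d, Md)))


def euclidean(a, b):
    """Plain Euclidean distance."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Hypothetical 2x3 transformation (hand-picked, NOT learned): it stretches the
# first feature, keeps the second and discards the third.
L = [[2.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
M = gram(L)

x = [1.0, 2.0, 3.0]
y = [0.0, 0.0, -1.0]

d_metric = mahalanobis(x, y, M)                        # distance under M in 3-D
d_embedded = euclidean(matvec(L, x), matvec(L, y))     # Euclidean distance in 2-D
print(round(d_metric, 4), round(d_embedded, 4))  # 2.8284 2.8284
```

The two numbers agree because measuring distance under M is the same as mapping the points through L and then using the ordinary Euclidean distance. When L has fewer rows than columns, that mapping also reduces the dimensionality of the data, although the resulting M is rank-deficient, so distinct points can end up at distance zero (strictly a pseudometric).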

Prerequisites:

Basic programming skills in Python, machine learning (familiarity with common classification and dimensionality reduction techniques) and linear algebra.

Speaker Info:

Kousik is pursuing his undergraduate studies at Chennai Mathematical Institute and has a keen interest in Machine Learning and Finance. He has contributed to multiple open source projects and has interned with the research and development teams of various organisations. His primary research interests include computer vision, graph-based machine learning algorithms and quantitative finance. He has also been involved in various technical talks at IIT-M and is a member of the Chennai Python Meetup group.

Section: Data science
Type: Talks
Target Audience: Intermediate