Dimensionality Reduction and Principal Component Analysis

Joydeep Bhattacharjee (~infinite-Joy)


Description:

When applying machine learning techniques, we usually work with large matrices. Each matrix may have many features (dimensions), which makes computation expensive. Running all of these computations in a production environment can be prohibitive, and high-dimensional data also increases the risk of overfitting. It is often useful to visualise the data as well, but humans cannot visualise more than three dimensions. For these reasons, we can use Principal Component Analysis (PCA) to reduce the number of dimensions in a data set. In this talk you will learn:

  • What Principal Component Analysis is and why you should be interested in it
  • The math behind Principal Component Analysis and why it works the way it does
  • How to select principal components
  • How to implement PCA in production using sklearn
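To make the points above concrete, here is a minimal sketch rather than the talk's actual code. It assumes synthetic data, a hypothetical 95% variance threshold, and two illustrative helper names (`pca_by_eigendecomposition`, `components_for_variance`). It computes principal components from the eigendecomposition of the covariance matrix, picks how many to keep from the cumulative explained variance, and then applies sklearn's `PCA` with that count.

```python
import numpy as np
from sklearn.decomposition import PCA


def pca_by_eigendecomposition(X):
    """Eigenvalues and eigenvectors of the covariance matrix of X,
    sorted by decreasing eigenvalue (illustrative helper)."""
    X_centered = X - X.mean(axis=0)          # PCA works on mean-centred data
    cov = np.cov(X_centered, rowvar=False)   # features are columns
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1]        # largest variance first
    return eigvals[order], eigvecs[:, order]


def components_for_variance(explained_variance_ratio, threshold=0.95):
    """Smallest number of components whose cumulative explained
    variance reaches the threshold (threshold is an assumption)."""
    cumulative = np.cumsum(explained_variance_ratio)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))


# Synthetic data (an assumption): 200 samples, 5 features,
# with almost all variance lying along 2 latent directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

eigvals, eigvecs = pca_by_eigendecomposition(X)
ratio = eigvals / eigvals.sum()
k = components_for_variance(ratio, threshold=0.95)

# The same reduction using sklearn, keeping k components.
pca = PCA(n_components=k)
X_reduced = pca.fit_transform(X)
print(k, X_reduced.shape)
```

The eigenvalues computed by hand should match sklearn's `explained_variance_` attribute, which is a quick sanity check that both routes describe the same decomposition.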

Additional and optional (if time permits):

  • How to plug PCA into an existing production application
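One common way to add PCA to an existing sklearn model is to place it in a pipeline ahead of the estimator. The sketch below is an assumption about how this could look, not the approach presented in the talk: the data is synthetic, linear regression stands in for the existing model, and the 0.95 variance fraction is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption): 10 observed features driven by 3 latent ones.
rng = np.random.default_rng(42)
latent = rng.normal(size=(300, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.05 * rng.normal(size=(300, 10))
y = latent @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=300)

# Scale, reduce dimensions, then fit the existing estimator.
# A float n_components asks PCA to keep enough components to
# explain that fraction of the variance.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95, svd_solver="full"),
    LinearRegression(),
)

model.fit(X[:200], y[:200])              # train on the first 200 rows
predictions = model.predict(X[200:])     # PCA is applied automatically
n_kept = model.named_steps["pca"].n_components_
print(n_kept, predictions.shape)
```

Because the pipeline exposes the usual `fit` and `predict` methods, calling code that already works with a plain estimator should not need to change.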

Prerequisites:

Knowledge of:

  • Matrices and Matrix Multiplication
  • pandas
  • NumPy
  • scikit-learn (sklearn)
  • Bokeh
  • Simple prediction algorithms such as linear regression

Content URLs:

  • https://docs.google.com/presentation/d/129s88g8tynuzN-IqNxBeKEsqie-7rn15FlYHbMZ4aSo/edit?usp=sharing
  • https://gist.github.com/infinite-Joy/b56e914aba76b427829328a5313cb290
  • https://gist.github.com/infinite-Joy/a7cc8a9975f12b33a896eb6c3412448d

Speaker Info:

Hello, I am a software engineer/data scientist working for a consulting firm called Nineleaps. I am currently working on a project that applies machine learning algorithms to various medical problems and to the pharmaceutical industry at large. I also host Flawcode, a podcast on various developer topics. I love talking about machine learning and software engineering, and you can say hi to me at @alt227Joydeep.

Speaker Links:

  • https://medium.com/@joydeepubuntu
  • https://flawcode.com/
  • https://github.com/infinite-Joy

Section: Data Analysis and Visualization
Type: Talks
Target Audience: Intermediate