Dimensionality Reduction and Principal Component Analysis
Joydeep Bhattacharjee (~infinite-Joy)
Most machine learning workflows operate on matrices, and each matrix can have many features (dimensions), which makes computation expensive. Running all of that computation in a production environment can be prohibitive, and high-dimensional data also increases the risk of overfitting. It is often useful to visualise the data as well, but as human beings we cannot directly visualise more than three dimensions. For these reasons we can turn to Principal Component Analysis (PCA) to reduce the number of dimensions in a data set. In this talk you will learn:
- What Principal Component Analysis is and why you should be interested in it
- The math behind principal component analysis and why it works the way it is supposed to
- How to select principal components
- How to implement PCA in production using sklearn
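To give a flavour of these topics, here is a minimal sketch rather than the talk's actual material. It assumes synthetic data (the names `latent`, `mixing` and the 95% variance threshold are illustrative choices, not from the talk), lets sklearn pick the number of components from a target explained-variance fraction, and checks that the variances sklearn reports agree with the largest eigenvalues of the covariance matrix.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration): 200 samples with 10
# features, where almost all variance lives in a 3-dimensional subspace.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

# A float n_components between 0 and 1 asks sklearn to keep the fewest
# components whose cumulative explained variance exceeds that fraction.
# This option requires the exact ("full") SVD solver.
pca = PCA(n_components=0.95, svd_solver="full")
X_reduced = pca.fit_transform(X)

# Cumulative share of variance captured by the kept components.
cumulative = np.cumsum(pca.explained_variance_ratio_)

# The math behind it: the component variances are the largest eigenvalues
# of the sample covariance matrix of X.
eigenvalues = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
top_eigenvalues = eigenvalues[: pca.n_components_]

print("reduced shape:", X_reduced.shape)
print("cumulative explained variance:", cumulative)
print("matches covariance eigenvalues:",
      np.allclose(top_eigenvalues, pca.explained_variance_))
```

Choosing the component count by a variance threshold is only one possible selection rule; inspecting a plot of `explained_variance_ratio_` is a common alternative.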
Additional and optional (if time permits)
- How to plug PCA into an existing production application
- Matrices and Matrix Multiplication
- Simple Prediction Algorithms like linear regression
Hello, I am a software engineer and data scientist working for a consulting firm called Nineleaps. I am currently working on a project that applies machine learning algorithms to various medical problems and to the pharmaceutical industry at large. I also host a podcast on various developer topics called Flawcode. I love talking about machine learning and software engineering, and you can say hi to me at @alt227Joydeep.