Feature Engineering for Kaggle and Machine Learning Competitions

Mohammad Shahebaz (~shaz13)


Description:

Advances in machine learning and artificial neural networks are surfacing answers to previously unanswered questions. Data and feature engineering are a large part of what has made AI and ML one of the most talked-about technologies of the 21st century. However complex and capable an algorithm is, there is always a need to crunch the numbers correctly with feature engineering that helps the model understand trends and classes better. This talk covers feature engineering for competitive machine learning problems on platforms such as Kaggle, Analytics Vidhya, and HackerEarth. It also covers a case study of a winning solution and inferences drawn from other competitions.

Prerequisites:

  1. Python
  2. Pandas
  3. Scikit-learn

Content URLs:

The talk will cover the following topics.

  • What difference can feature engineering make?
  • Feature Engineering using Python
    • Numerical techniques
    • Categorical techniques
    • Variable Interactions
    • Decomposition techniques
    • Using Google Scholar and domain knowledge
  • Case studies on winning competitions
  • Live problem-solving on a Kaggle competition
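
As a rough illustration of the four technique families listed above, here is a minimal sketch using pandas and scikit-learn. It is not taken from the talk material: the toy data, the column names, and the choice of one technique per family (log transform, frequency encoding, a ratio feature, PCA) are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "price": [100.0, 250.0, 80.0, 1200.0, 40.0, 310.0],
    "area": [50.0, 100.0, 40.0, 300.0, 20.0, 155.0],
    "city": ["Pune", "Delhi", "Pune", "Mumbai", "Pune", "Delhi"],
})

# Numerical technique: log transform to compress a skewed column.
df["log_price"] = np.log1p(df["price"])

# Categorical technique: frequency encoding
# (each category is replaced by its share of the rows).
df["city_freq"] = df["city"].map(df["city"].value_counts(normalize=True))

# Variable interaction: ratio of two numeric columns.
df["price_per_area"] = df["price"] / df["area"]

# Decomposition technique: first principal component
# of the standardized numeric columns.
numeric = df[["price", "area", "log_price"]]
scaled = (numeric - numeric.mean()) / numeric.std()
df["pca_1"] = PCA(n_components=1).fit_transform(scaled)[:, 0]
```

In a competition setting, encodings such as `city_freq` would normally be fitted on the training split only and then applied to the validation and test splits, to avoid leaking information.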

Speaker Info:

Speakers: Sudarshan Gadhave and Mohammad Shahebaz

Sudarshan Gadhave is a Data Science, Data Engineering, and Data Integration professional with over 8 years of experience working on Machine Learning, Data Engineering, Data Visualization, and Data Warehousing projects. He currently works as a Specialist Data Scientist in the Analytics R&D team of NICE Actimize (NICE Systems), developing anomaly and fraud detection models. He previously worked in the Advanced Analytics and Data Warehousing teams of NEC, Japan, and John Deere (Deere & Company). He is a Pythonista and an expert in the Python machine learning stack (NumPy, Pandas, Scikit-learn, Matplotlib), and an active core member of the Python Pune meetup group. He has presented several talks on Python and machine learning at meetups, conferences, and colleges all over Pune.

Mohammad Shahebaz is a data scientist intern at Analytics Vidhya. He was India's finalist in the Microsoft World Championship 2013 and a finalist at Master Orator Champion 2016, and he has won a regional gold medal in the International Maths Olympiad (IMO). He is currently following the latest trends in Machine Learning and Artificial Intelligence while earning competitive positions in national-level competitions and on the Kaggle platform. He loves open source and has contributed to organizations like Google Web Fundamentals, Scikit Learn, and FOSSASIA, and is serving as Social Committee Lead at Oppia.org in Google Summer of Code. On a path to bring machine learning and artificial intelligence to the Indian masses, he open-sources his code and approaches on GitHub and through the organization MLBYTE.

Speaker Links:

Sudarshan Gadhave

  1. GitHub: https://github.com/sudarshan1413
  2. LinkedIn: https://www.linkedin.com/in/sudarshan-gadhave-73567b23/

Mohammad Shahebaz

  1. Shahebaz LinkedIn Profile
  2. Shahebaz GitHub Profile
  3. Rank 2 at Analytics Vidhya overall leaderboard
  4. Kaggle Profile

Mentions

  1. Master Orator Champion
  2. 1st runner-up of TechGig Machine Learning Hackathon - June 8, 2018

Section: Data science
Type: Talks
Target Audience: Beginner
Last Updated: