Exploratory Data Analysis using pandas and Matplotlib
Over the years, machine learning has been on the rise. It is so powerful that it almost tempts us to skip the exploratory data analysis phase, but feeding data into a black box and waiting for the results is rarely a good idea. Exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. pandas is a Python library that provides extensive tools for data analysis. In conjunction with Matplotlib, it offers a wide range of options for visual analysis of tabular data.
The workshop will cover the following:
- Data wrangling: Poorly formatted data can drastically reduce a model's accuracy, so data wrangling plays an important role.
a. Viewing the data properly (shape, basic statistics)
b. Identifying the data types of attributes and the pattern of null values
c. Dealing with missing values by replacing, dropping, or filling them (using various pandas methods)
d. Converting categorical data into numeric data
e. Normalisation of values (e.g. taking the log to reduce skewness)
f. Binning of values
- Visualisation using Matplotlib, including bar plots, scatter plots, histograms, etc.
- Finding correlations for feature selection
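
As a rough illustration of the topics above, here is a minimal sketch on a small made-up table. The column names (`age`, `income`, `city`), the values, the median fill, the one-hot encoding and the bin edges are all assumptions chosen for the example, not the workshop's actual dataset or methods.

```python
import numpy as np
import pandas as pd
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical toy data; the real workshop dataset is not specified.
df = pd.DataFrame(
    {
        "age": [22, 35, np.nan, 58, 41, 29],
        "income": [18000, 42000, 39000, 250000, np.nan, 31000],
        "city": ["Pune", "Mumbai", "Pune", "Nagpur", None, "Mumbai"],
    }
)

# a. View the data: shape and basic statistics.
print(df.shape)
print(df.describe())

# b. Data types and the pattern of null values.
print(df.dtypes)
print(df.isnull().sum())

# c. Missing values: fill numeric gaps with the median, drop rows with no city.
df["age"] = df["age"].fillna(df["age"].median())
df["income"] = df["income"].fillna(df["income"].median())
df = df.dropna(subset=["city"]).reset_index(drop=True)

# d. Categorical to numeric: one-hot encode the city column.
encoded = pd.get_dummies(df, columns=["city"])

# e. Normalisation: log transform to reduce the right skew of income.
encoded["log_income"] = np.log1p(encoded["income"])

# f. Binning: group ages into labelled ranges (bin edges are illustrative).
encoded["age_group"] = pd.cut(
    encoded["age"], bins=[0, 30, 50, 120], labels=["young", "middle", "senior"]
)

# Visualisation: a histogram and a scatter plot side by side.
fig, axes = plt.subplots(1, 2, figsize=(8, 3))
axes[0].hist(encoded["log_income"], bins=5)
axes[0].set_title("log(1 + income)")
axes[1].scatter(encoded["age"], encoded["income"])
axes[1].set_xlabel("age")
axes[1].set_ylabel("income")
fig.tight_layout()
plt.close(fig)  # in an interactive session, call plt.show() instead

# Correlation between numeric features, a starting point for feature selection.
corr = encoded[["age", "income", "log_income"]].corr()
print(corr)
```

Median filling and one-hot encoding are only one option each; pandas also offers `replace`, forward/backward filling and category codes, and the right choice depends on the data.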
Prerequisites: basic knowledge of Python.
I am Purva Chaudhari, a third-year student of computer science and engineering at Government Engineering College, Aurangabad. I have some knowledge of big data technologies such as Hadoop, Hive, and Spark. I started learning Python two months ago because I am interested in data analytics and data science.