Python Data Visualization for data scientists (and software developers)
Data visualization is an important ingredient of a data science project workflow. A typical data science endeavor involves data exploration and understanding, data cleaning and transformation, building a machine learning model on the transformed data, and gathering and presenting the results. Visualization skills are helpful (and often necessary) for carrying out each of these steps efficiently. The visualization landscape in Python has expanded considerably over the last decade: rather than writing verbose matplotlib code, we now have at our disposal high-level tools built for specific tasks and for simplicity. In this talk, I'll discuss the role of data visualization in the data science landscape and describe Python data visualization tools with examples. I'll begin with the grammar of graphics and matplotlib, dig deeper into pandas and seaborn for statistical visualization, and then move to data visualization for the web, which is the de facto medium these days for presentations, dashboards, and reports. This talk will involve live coding in IPython notebooks.
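To give a flavor of the contrast between explicit matplotlib code and a higher-level tool, here is a minimal sketch that draws the same bar chart twice: once by hand with matplotlib and once through the pandas plotting interface. The toy data and the column names `group` and `value` are invented for illustration and are not taken from the talk material.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, made up purely for this illustration.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "c", "c"],
    "value": [1.0, 3.0, 2.0, 6.0, 4.0, 8.0],
})

# Mean of "value" per group; this is the quantity both charts display.
means = df.groupby("group")["value"].mean()

fig, (ax_low, ax_high) = plt.subplots(1, 2, figsize=(8, 3))

# Explicit matplotlib: bar positions, tick labels and axis labels are set by hand.
positions = range(len(means))
ax_low.bar(positions, means.values)
ax_low.set_xticks(positions)
ax_low.set_xticklabels(means.index)
ax_low.set_xlabel("group")
ax_low.set_ylabel("mean value")
ax_low.set_title("matplotlib")

# pandas: one call draws the bars and labels the x-axis from the index.
means.plot(kind="bar", ax=ax_high, title="pandas")
ax_high.set_ylabel("mean value")

fig.tight_layout()
```

In a notebook the backend line can be dropped and the figure renders inline. The pandas call still produces a regular matplotlib `Axes`, so fine-grained matplotlib adjustments remain available when the defaults are not enough.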
- Python Programming
- Interest in working with data.
- Basic knowledge of data analysis/science and statistics is useful though not necessary.
Check out the following tools (we will go through these in the talk):
Slides (in progress) - https://drive.google.com/file/d/0B8JxqK_pafNqUE5Qc2VocnNGazg/view?usp=sharing
Janu Verma is a data visualization and machine learning researcher at IBM TJ Watson Research Center in New York, currently on an assignment at IBM India Research Lab in New Delhi. He works in the HealthCare Analytics Research Group. Previously, he was a Quantitative Researcher in the Buckler Lab @ Cornell University, where he worked on problems in bioinformatics and genomics. He has held research positions at the Machine Learning Lab at Kansas State University, the Tata Institute of Fundamental Research, Mumbai, and the Jawaharlal Nehru Center for Advanced Scientific Research, Bangalore. His work has appeared in IEEE Vis, KDD, International Conference on HealthCare Informatics, Computer Graphics and Applications, Nature Genetics, and IEEE Sensors Journal, among others. His current focus is on the development of visual analytics systems for prediction and understanding.