Automated data analysis with Python

by Anand S (speaking)

Section
Scientific Computing
Session type
Talk
Technical level
Intermediate

Objective

Given a dataset, is it possible to automatically discover insights?

Can a machine analyse data and identify the most interesting patterns?

This talk walks through the basics of automated analysis techniques in Python that you can apply to most datasets.

Description

The talk will cover:

  1. How to identify the numeric and non-numeric fields in a dataset
  2. The kinds of analyses that can be performed, such as correlation and comparing means
  3. How to identify the "interestingness" of each result, and show only the most interesting ones.
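
As a rough illustration of the first step, here is a minimal sketch of splitting fields into numeric and non-numeric. It assumes the dataset is already loaded as a dict mapping column names to lists of strings (for example, from a CSV file). The names is_numeric and split_fields are hypothetical helpers for this example, not taken from the talk's code.

```python
def is_numeric(values):
    """Return True if every non-blank value parses as a float.

    Assumption: blank strings mean "missing" and are ignored.
    Note that float() also accepts strings such as "nan" and "inf".
    """
    cleaned = [v for v in values if v.strip() != ""]
    if not cleaned:
        return False  # an all-blank column is not treated as numeric
    try:
        for v in cleaned:
            float(v)
    except ValueError:
        return False
    return True


def split_fields(columns):
    """Split column names into (numeric, non_numeric) lists."""
    numeric = [name for name, vals in columns.items() if is_numeric(vals)]
    non_numeric = [name for name in columns if name not in numeric]
    return numeric, non_numeric


# Hypothetical sample data, for illustration only
columns = {
    "age": ["23", "31", ""],
    "city": ["Pune", "Delhi", "Pune"],
}
numeric, non_numeric = split_fields(columns)
```

Numeric fields can then feed correlation tests, while non-numeric fields can serve as grouping variables for comparing means.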

The analysis will be entirely in Python, using NumPy and SciPy in particular. Some of the functions that will be used are scipy.stats.pearsonr, scipy.stats.kruskal, and scipy.stats.ttest_ind.
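
To make this concrete, here is a hedged sketch of how those three SciPy functions might be combined: run every applicable test, then rank the results. Using the p-value as the "interestingness" score is an assumption made for this example; the proposal does not say which measure the talk uses. The helper names and the sample data are hypothetical.

```python
from itertools import combinations

from scipy import stats


def correlation_results(numeric):
    """Pearson correlation for every pair of numeric fields."""
    results = []
    for a, b in combinations(sorted(numeric), 2):
        r, p = stats.pearsonr(numeric[a], numeric[b])
        results.append((f"{a} ~ {b}", "pearsonr", p))
    return results


def group_results(numeric, labels):
    """Compare each numeric field across the groups named in labels.

    Uses a t-test for exactly two groups and Kruskal-Wallis for more.
    Assumes labels contains at least two distinct groups.
    """
    results = []
    groups = sorted(set(labels))
    for name, values in numeric.items():
        samples = [[v for v, g in zip(values, labels) if g == grp]
                   for grp in groups]
        if len(samples) == 2:
            _, p = stats.ttest_ind(samples[0], samples[1])
            test = "ttest_ind"
        else:
            _, p = stats.kruskal(*samples)
            test = "kruskal"
        results.append((f"{name} by group", test, p))
    return results


def most_interesting(results, top=3):
    """Rank by p-value (assumed proxy for interestingness), smallest first."""
    return sorted(results, key=lambda item: item[2])[:top]


# Hypothetical sample data, for illustration only
numeric = {
    "height": [150, 160, 170, 180, 190, 200],
    "weight": [50, 61, 69, 80, 92, 99],
    "noise": [3, 1, 4, 1, 5, 9],
}
labels = ["a", "a", "a", "b", "b", "b"]

results = correlation_results(numeric) + group_results(numeric, labels)
for description, test, p in most_interesting(results):
    print(f"{description:20s} {test:10s} p = {p:.4g}")
```

Ranking by raw p-value is only one possible choice. With many tests on the same dataset, some correction for multiple comparisons would usually be needed before calling a result interesting.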

Speaker bio

Anand is a data scientist at Gramener, a data visualisation company. For more about Anand, visit http://s-anand.net/

Comments


  • 2

    [-] Vikas Kumar Choudhary 876 days ago

    Thanks for the nice presentation. Please upload the slides.


  • 2

    [-] Praveen Puglia 876 days ago

    Loved the presentation. Would love to have a full-time tutorial from your side :)


    • 1

      [-] yati sagade 872 days ago

      I too would love to see a blog post series or a video series that covers this in fair depth :)


  • 1

    [-] Anand B Pillai 910 days ago

    Anand, not sure how the talk is about anything to do with Python, since from the description this looks like a generic data analysis talk. Please give information or links to the Python tools/libraries you are going to mention in this talk.


  • 1

    [-] Anand S 909 days ago

    The analysis will be entirely in Python, Anand, using NumPy and SciPy in particular. Some of the functions that will be used are scipy.stats.pearsonr, scipy.stats.kruskal, and scipy.stats.ttest_ind.


  • 1

    [-] Anand B Pillai 909 days ago

    Ok - Sounds good. Thanks for clarifying. Kindly update the description to reflect it since evaluators may not look at comments.


  • 1

    [-] software mechanic 873 days ago

    Hi Anand, I missed it in the two days' confusing competition. I caught 10-15 minutes of a late-running session and wanted to catch more. Can you put up the code, examples, and presentation somewhere online?


  • 1

    [-] Anand S 873 days ago

    The slides are at http://www.slideshare.net/gramener/automated-data-analysis-with-python and the code at https://github.com/sanand0/pyconindia2012-autolysis
