Faster data processing in Python

by Anand S (speaking)

Section
Scientific Computing
Technical level
Intermediate

Objective

To learn the libraries, techniques and benchmarks that help you speed up data storage, retrieval and processing.

Description

Working with data in Python requires making a number of choices, ranging from the simple to the complex.

  • Should I use pickle, CSV or JSON? (Ans: CSV).
  • What do I read it with: csv.DictReader or csv.reader? (Ans: Pandas).
  • How should I parse dates? (Ans: Anything but Pandas / dateutil)
  • How do I optimise numpy calculations? (Ans: Learn vector algebra)
  • How do I run a function in parallel?
  • How do I make my program restartable?
  • How do I use multiple cores?

This session will explain how to benchmark code and share insights on the patterns of programming that make your application faster.
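
As a rough illustration of the kind of benchmark the session has in mind, the sketch below times csv.reader against csv.DictReader using only the standard library. The in-memory dataset, the function names and the repeat counts are assumptions made for this example, not material from the talk, and the timings will vary by machine and Python version.

```python
import csv
import io
import timeit

# Hypothetical dataset for illustration: 10,000 rows with 3 columns, held in memory
# so that disk speed does not affect the comparison.
lines = ["id,name,value"] + [f"{i},item{i},{i * 0.5}" for i in range(10_000)]
data = "\n".join(lines)


def read_with_reader():
    """Parse every row into a list of strings."""
    reader = csv.reader(io.StringIO(data))
    next(reader)  # skip the header row
    return list(reader)


def read_with_dictreader():
    """Parse every row into a dict keyed by the header names."""
    return list(csv.DictReader(io.StringIO(data)))


def benchmark(func, repeat=5, number=10):
    """Return the best observed time per call, in seconds."""
    # Taking the minimum of several runs reduces noise from other processes.
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


results = {f.__name__: benchmark(f) for f in (read_with_reader, read_with_dictreader)}
for name, seconds in results.items():
    print(f"{name}: {seconds * 1000:.2f} ms per call")
```

The same benchmark() helper can wrap any zero-argument function, so other choices from the list above (date parsing, file formats) can be compared the same way.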

Requirements

A good working knowledge of Python and its standard library.

Speaker bio

Anand is the chief data scientist at Gramener. He explores data stories visually with Python and JavaScript.

He blogs at http://www.s-anand.net/