Objective
To understand the libraries, techniques and benchmarks that help you speed up data storage, retrieval and processing.
Description
Working with data in Python requires making a number of choices, ranging from the simple to the complex.
- Should I use pickle, CSV or JSON? (Ans: CSV)
- What do I read it with: csv.DictReader or csv.reader? (Ans: Pandas).
- How should I parse dates? (Ans: Anything but Pandas / dateutil)
- How do I optimise NumPy calculations? (Ans: Learn vector algebra)
- How do I run a function in parallel?
- How do I make my program restartable?
- How do I use multiple cores?
This session will explain how to benchmark code and share insights on the patterns of programming that make your application faster.
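As a taste of the benchmarking approach, here is a minimal sketch that times csv.reader against csv.DictReader using the standard timeit module. The in-memory data set, the row count and the helper names (read_with_reader, read_with_dictreader, benchmark) are assumptions made for illustration, not material from the session itself, and the timings will vary by machine and Python version.

```python
import csv
import io
import timeit

# Hypothetical in-memory data set: 10,000 rows with three columns.
ROWS = 10_000
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["id", "name", "value"])
for i in range(ROWS):
    writer.writerow([i, f"name{i}", i * 0.5])
DATA = buffer.getvalue()


def read_with_reader():
    """Parse every line (header included) as a list using csv.reader."""
    return list(csv.reader(io.StringIO(DATA)))


def read_with_dictreader():
    """Parse every data row as a dict using csv.DictReader."""
    return list(csv.DictReader(io.StringIO(DATA)))


def benchmark(func, repeat=3, number=5):
    """Return the best per-call time in seconds over several repeats."""
    times = timeit.repeat(func, repeat=repeat, number=number)
    return min(times) / number


if __name__ == "__main__":
    for func in (read_with_reader, read_with_dictreader):
        print(f"{func.__name__}: {benchmark(func) * 1000:.2f} ms per call")
```

Taking the minimum over several repeats, rather than the mean, reduces the influence of other processes running on the machine while the benchmark runs.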
Requirements
A good working knowledge of Python and its standard library.
Speaker bio
Anand is the chief data scientist at Gramener. He explores data stories visually with Python and JavaScript.
He blogs at http://www.s-anand.net/