Decoding Bollywood with Python and data science

Marc Garcia (~marc)




"Our Business Is Our Business None Of Your Business…"

Yes, they wish, but we want to know everything about Bollywood!

  • Who is more popular, Katrina Kaif or Deepika Padukone?
  • What movie is the most similar to PK based on the storyline?
  • Which city in India is home to the most active actresses and actors?

And lots of other questions. Do you want to know the answers? Even better, would you like to discover them yourself using Python, popular libraries such as pandas, Gensim and scikit-learn, and cutting-edge data science techniques? Join us for a workshop full of insights, where you will be able to answer your own questions while learning the most advanced Python libraries and algorithms.

The workshop is designed for Python programmers new to data science. Everybody is welcome, but data analysts and people experienced with pandas will find some parts basic.

What will we cover?

  • Loading, merging, cleaning and analysing your data with pandas
  • Advanced data visualisation with Bokeh
  • Embeddings and natural language processing with Gensim
  • Basic machine learning with scikit-learn

All this while answering the questions above, and letting you answer your own questions.
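
To give a flavour of the first topic, here is a minimal sketch of merging, cleaning and analysing data with pandas. The tables, column names and values are hypothetical placeholders invented for illustration; they are not the workshop's actual dataset.

```python
import pandas as pd

# Hypothetical toy tables (placeholder names, not real workshop data).
movies = pd.DataFrame({
    "movie_id": [1, 2, 3],
    "title": ["Movie A", "Movie B", "Movie C"],
    "year": ["2014", "2016", None],  # years stored as text, one missing
})
cast = pd.DataFrame({
    "movie_id": [1, 1, 2, 3],
    "actor": ["Actor X", "Actor Y", "Actor X", "Actor Z"],
})

# Cleaning: drop movies without a year and convert the rest to integers.
movies_clean = movies.dropna(subset=["year"]).astype({"year": int})

# Merging: attach cast members to each movie that survived cleaning.
merged = movies_clean.merge(cast, on="movie_id", how="inner")

# Analysing: count the number of distinct movies per actor.
movies_per_actor = (
    merged.groupby("actor")["movie_id"]
    .nunique()
    .sort_values(ascending=False)
)
print(movies_per_actor)
```

The inner join keeps only movies present in both tables, so cast entries for the movie dropped during cleaning disappear from the final count.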


Prerequisites:

  • Laptop with Anaconda3 installed
  • Clone of the workshop repository
  • Knowledge of Python
  • Good knowledge of Bollywood desirable :)

Speaker Info:

Marc Garcia is a Python fellow and pandas core developer. He has worked as a software engineer and data scientist for companies like Bank of America, Tesco, Unilever and NTT Communications. He is a regular organiser of sprints, and a speaker at PyCon and PyData conferences. His favourite actor is Aamir Khan, but he wouldn't mind teaching Python to Asin.

Himanshu Awasthi is the organiser of Kanpur Python and PyData Kanpur. A free and open source software enthusiast who is passionate about Python and data analysis, he currently works for KanpurFOSS, an organization that runs free technical workshops in India. This workshop... is a data analysis workshop... and it will only end once it has taught someone data analysis...

Id: 762
Section: Data science
Type: Workshops
Target Audience: Beginner
Last Updated: