Topic Modelling with Python

Parul Sethi (~parulsethi)




Topic modelling is a great way to analyse completely unstructured textual data, and the Python NLP framework Gensim makes it easy to do. The purpose of this tutorial is to guide attendees through the whole process of topic modelling: pre-processing the raw textual data, creating the topic models, evaluating them, and visualising them. We will also see its applications in a few NLP tasks: discovering topic correlation (with dendrograms), document clustering (demo with Tensorboard), and document analysis (using word coloring).

The Python packages used during the tutorial will be spaCy (for pre-processing), gensim (for topic modelling), and Visdom, pyLDAvis and Plotly (for visualization). The interface for the tutorial will be a Jupyter notebook.
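To give a feel for the pre-processing step that precedes topic modelling, here is a minimal sketch using only the Python standard library. It is a stand-in, not the tutorial's actual code: the toy documents, the stop-word list, and the helper names (`tokenize`, `build_vocab`, `to_bow`) are all assumptions made for illustration. In the tutorial itself, spaCy would handle tokenisation and gensim would build the dictionary and bag-of-words corpus.

```python
import re
from collections import Counter

# Hypothetical toy corpus and stop-word list, for illustration only.
DOCS = [
    "The cat sat on the mat with another cat.",
    "Dogs and cats are popular pets.",
    "Stock markets fell as investors sold shares.",
]
STOP_WORDS = {"the", "on", "with", "and", "are", "as", "another"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_vocab(token_lists):
    """Give each distinct token an integer id, in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def to_bow(tokens, vocab):
    """Return sorted (token_id, count) pairs for a single document."""
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    return sorted(counts.items())


texts = [tokenize(doc) for doc in DOCS]
vocab = build_vocab(texts)
corpus = [to_bow(tokens, vocab) for tokens in texts]

for doc_id, bow in enumerate(corpus):
    print(doc_id, bow)
```

Each document ends up as a list of (token id, count) pairs, which is the kind of bag-of-words input a topic model such as LDA is typically trained on.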

Speaker Info:

Parul Sethi

I'm a Pythonista studying Maths and IT at the University of Delhi. Out of love for open source and NLP, I regularly contribute to gensim, a widely used Python library, and I have also been selected as their GSoC (Google Summer of Code) student under the NumFOCUS umbrella for 2017 (my live blog).

Chaitali Saini

Speaker Links:

Parul Sethi


github, twitter, linkedin


Chaitali Saini

Section: Data Analysis and Visualization
Type: Workshops
Target Audience: Intermediate
Last Updated:

Hi, can you please upload the slides for the talk so that your proposal can be reviewed?

Pradhvan Bisht (~cyber_freak)
