Topic Modelling with Python

Parul Sethi (~parulsethi)



Description:

Topic modelling is a great way to analyse completely unstructured textual data, and the Python NLP framework Gensim makes it very easy to do. This tutorial guides you through the whole process of topic modelling: pre-processing the raw textual data, creating the topic models, evaluating them, and visualising them. We will also see its applications in a few NLP tasks: discovering topic correlation (with dendrograms), document clustering (demo with TensorBoard), and document analysis (using word coloring).

The Python packages used during the tutorial will be spaCy (for pre-processing), gensim (for topic modelling), and Visdom, pyLDAvis and Plotly (for visualization). The interface for the tutorial will be a Jupyter notebook.
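To give a feel for the first step of that pipeline, here is a minimal sketch of turning raw text into the bag-of-words form that a topic model consumes. It uses only the Python standard library so it can run anywhere; in the tutorial itself this work would be done with spaCy and gensim. The helper names (`tokenize`, `build_vocabulary`, `to_bag_of_words`), the tiny stop-word list and the two sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real pipelines use a fuller one).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_vocabulary(tokenized_docs):
    """Map each distinct token to an integer id, assigned in sorted order."""
    vocab = sorted({tok for doc in tokenized_docs for tok in doc})
    return {tok: idx for idx, tok in enumerate(vocab)}


def to_bag_of_words(tokens, token2id):
    """Return sorted (token_id, count) pairs for one document.

    Tokens missing from the vocabulary are ignored.
    """
    counts = Counter(token2id[t] for t in tokens if t in token2id)
    return sorted(counts.items())


# Hypothetical sample documents, for illustration only.
raw_docs = [
    "Topic modelling is a way to analyse text.",
    "Visualising the topic models helps to analyse the topics.",
]

tokenized = [tokenize(doc) for doc in raw_docs]
token2id = build_vocabulary(tokenized)
corpus = [to_bag_of_words(doc, token2id) for doc in tokenized]

for doc, bow in zip(raw_docs, corpus):
    print(doc, "->", bow)
```

Each document becomes a list of `(token_id, count)` pairs, which is the same shape gensim uses for its bag-of-words corpora, so a corpus like this is roughly what would be handed to a topic model in the next step.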

Speaker Info:

Parul Sethi

I'm a Pythonista studying Maths and IT at the University of Delhi. For the love of open source and NLP, I regularly contribute to gensim, a widely used Python library, and have also been selected as their GSoC (Google Summer of Code) student under the NumFOCUS umbrella for 2017 (my live blog).

Chaitali Saini

Speaker Links:

Parul Sethi

Accounts:

github, twitter, linkedin

Blogs:

https://rare-technologies.com/gsoc17-training-and-topic-visualizations/

https://rare-technologies.com/wordrank-embedding-crowned-is-most-similar-to-king-not-word2vecs-canute/

Chaitali Saini

Section: Data Analysis and Visualization
Type: Workshops
Target Audience: Intermediate
Last Updated: