Playing around with text data




This talk introduces developers to techniques and Python libraries for text processing.

Tools and Libraries

  • NLTK
  • spaCy
  • Gensim

Outline of the talk

  • Challenges with text data
  • Preprocessing approaches
  • POS Tagging
  • Vectorization - Word2Vec, Doc2Vec, GloVe
  • Tips for beginners
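
As a rough illustration of the preprocessing step in the outline, here is a minimal sketch using only the Python standard library. It is not the NLTK, spaCy, or Gensim API: the `preprocess` and `bag_of_words` helpers and the tiny stop-word list are hypothetical stand-ins, and the count-based vector is a much simpler substitute for Word2Vec, Doc2Vec, or GloVe embeddings.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption for this sketch;
# NLTK and spaCy ship much fuller lists).
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "for", "with"}


def preprocess(text):
    """Lowercase the text, tokenize on runs of letters, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens):
    """Count how often each token occurs (a simple count-based vector)."""
    return Counter(tokens)


tokens = preprocess("The talk is an introduction to text processing with Python.")
counts = bag_of_words(tokens)
print(tokens)
print(counts.most_common(3))
```

In practice the libraries listed above replace each of these steps with more robust tokenizers, stop-word lists, and learned vector representations.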


Prerequisites:

Basics of text processing and mathematics

Content URLs:

Gensim spaCy

Speaker Info:

S. Dhanya Abhirami is a final-year CS undergraduate student at Vellore Institute of Technology (VIT), Vellore. She is a self-motivated and passionate coder with experience in Python, C++ and JavaScript. She has worked on projects in sentiment analysis, parallel computing and cryptography. She interned at Visa Inc, Bengaluru in the summer of 2019. As an active member of the ACM Chapter of VIT, she has conducted a Python workshop for freshers. She gave a lightning talk at PySangamam 2018. She is the recipient of awards including the SRM Thought Leadership Scholarship and the University Merit Scholarship. She also takes part in competitive coding challenges.

Speaker Links:

Github LinkedIn Medium

Id: 1435
Section: Data Science, Machine Learning and AI
Type: Talks
Target Audience: Intermediate
Last Updated: