Cutting-edge NLP classifiers in one hour with Python and fastText
Joydeep Bhattacharjee (~infinite-Joy)
fastText was open-sourced by Facebook in 2016 and quickly became one of the fastest libraries available from Python for text classification and word representation. It can be used as an alternative to the word2vec implementation in the gensim package. It implements two important word-embedding models in NLP, Continuous Bag of Words (CBOW) and skip-gram, and it performs well on both supervised tasks (text classification) and unsupervised tasks (learning word vectors).
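To make the difference between the two models concrete, here is a minimal sketch in plain Python of the training examples each model is built from. The function names, the toy sentence, and the window size are illustrative assumptions and are not part of fastText's API. fastText builds such examples internally, and real implementations add details such as subsampling and negative sampling that this sketch leaves out.

```python
def cbow_examples(tokens, window=2):
    """Yield (context_words, target_word) pairs.

    CBOW predicts the target word from the words around it.
    """
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:
            yield context, target


def skipgram_examples(tokens, window=2):
    """Yield (target_word, context_word) pairs.

    Skip-gram predicts each surrounding word from the target word.
    """
    for context, target in cbow_examples(tokens, window):
        for word in context:
            yield target, word


# Hypothetical toy sentence, already tokenized by whitespace.
tokens = "fasttext learns word vectors quickly".split()
cbow = list(cbow_examples(tokens, window=1))
skipgram = list(skipgram_examples(tokens, window=1))
```

In the current official fastText Python module, the choice between the two is made with the `model` argument of `fasttext.train_unsupervised`, which accepts `'cbow'` or `'skipgram'`.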
The tutorial will be divided into the following five segments:
0-10 minutes: The talk will begin by explaining the common paradigms in use right now. Is deep learning really necessary?
10-15 minutes: What are word representations?
15-25 minutes: The code for both models (CBOW and skip-gram) will be shown and explained line by line on a standard labelled text dataset, showing how you can do fast prototyping with minimal code.
25-30 minutes: How to use the pre-trained word embeddings released by fastText for various languages, and where to use them. Why Python 3 is the best language for multi-language support, and a note on general deep learning using fastText.
30-40 minutes: Q&A session.
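As a rough sketch of the data preparation behind the code walkthrough, fastText's supervised mode reads a plain text file with one example per line, where each label is marked by the `__label__` prefix (the default). The helper name `to_fasttext_line`, the sample reviews, and the file name below are hypothetical and only serve as an illustration.

```python
import os
import tempfile

LABEL_PREFIX = "__label__"  # fastText's default label prefix


def to_fasttext_line(label, text):
    """Format one labelled example as a single line for supervised training."""
    # Each example must sit on one line, so collapse newlines and extra spaces.
    clean_text = " ".join(text.split())
    return f"{LABEL_PREFIX}{label} {clean_text}"


# Hypothetical toy dataset of (label, text) pairs.
samples = [
    ("positive", "Loved the talk,\nvery clear examples"),
    ("negative", "Too fast   and hard to follow"),
]

lines = [to_fasttext_line(label, text) for label, text in samples]

# Write the training file to a temporary, illustrative location.
train_path = os.path.join(tempfile.mkdtemp(), "reviews.train.txt")
with open(train_path, "w", encoding="utf-8") as handle:
    handle.write("\n".join(lines) + "\n")
```

With the fastText Python module installed, a file like this can be passed to `fasttext.train_supervised(input=train_path)`, although a real dataset needs far more than two examples to give a useful classifier.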
Prerequisites:
- Basic Python knowledge.
- Some knowledge of common NLP techniques.
Joydeep is a machine learning engineer and Python developer, and a Principal Engineer at Nineleaps. Five years ago he came across the Zen of Python, fell in love with Python, and has been in love with it ever since. Apart from his day-to-day work, he blogs on Medium and podcasts on flawcode. Teaching is another passion of his, and he is a Python/ML trainer at tecmax.
- Medium: https://medium.com/@joydeepubuntu/latest
- GitHub: https://github.com/infinite-Joy
- LinkedIn: https://www.linkedin.com/in/joydeep-bhattacharjee-934a1157/
- Machine Learning Podcast: https://flawcode.com/episode/show/12
- Twitter: https://flawcode.com/episode/show/12