Rise of Transfer Learning in NLP
Kartik Aggarwal (~Kartikaggarwal98)
The nature of human language itself makes it difficult for any machine learning model to understand text. But the past two years have been a big turning point for NLP. Pre-trained models like ELMo, GPT, BERT, and XLNet have made headlines by achieving state-of-the-art (SOTA) results on a wide range of NLP tasks. This marks a new era: anyone involved in language processing can now use these models, trained on massive datasets, to transfer knowledge and build powerful components, thereby saving time and resources. Unlike transfer learning in computer vision, NLP pre-training does not even require labelled data. In this talk, I will give a big-picture understanding of modern NLP methods and how NLP transitioned from traditional word vectors to generalized language models.
The goal of this talk is to:
- Provide a broad overview of transfer learning methods in NLP.
- Discuss the latest milestones in NLP as of mid-2019.
The talk is for anyone who:
- Is a beginner in NLP and wondering how transfer learning works for NLP.
- Wants to get an idea of how models like ELMo, BERT, XLNet, etc. work and achieve such promising results.
Agenda of the Talk:
- Understand Transfer Learning: What and Why? (5 mins)
- Word Vectors/Embeddings: Visualization (5 mins)
- Contextual Word Vectors: CoVe (3 mins)
- Language Modelling: ELMo, ULMFiT, GPT, BERT, XLNet (10 mins)
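As a minimal taste of the word-vector idea covered in the agenda: words are mapped to dense vectors, and semantic relatedness is measured by cosine similarity between them. The vectors below are hand-made toy values for illustration, not trained embeddings like word2vec or GloVe.

```python
import numpy as np

# Toy 4-dimensional "word vectors" -- illustrative values, not trained embeddings.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1, 0.2]),
    "queen": np.array([0.9, 0.7, 0.2, 0.3]),
    "apple": np.array([0.1, 0.2, 0.9, 0.8]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Related words point in similar directions; unrelated words do not.
print(cosine_similarity(vectors["king"], vectors["queen"]))  # close to 1
print(cosine_similarity(vectors["king"], vectors["apple"]))  # much lower
```

In a real pipeline, these vectors would come from a model pre-trained on a large corpus; contextual models like ELMo and BERT go further by producing a different vector for a word depending on its sentence.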
Kartik Aggarwal is currently a final-year undergrad at NSIT, Delhi. He is a Research Assistant at DRDO, India under the supervision of Dr. Gurjit Singh Walia. Simultaneously, he works as an Instructor at CampK12, teaching machine learning to high school and undergrad students.
This year, he published a paper at NAACL 2019 while working with the MIDAS Lab, IIIT Delhi, advised by Dr. Rajiv Ratn Shah and Dr. Debanjan Mahata. His research interests lie in Natural Language Processing, Computer Vision, Machine Learning, and Information Fusion.