Fine-Tuning BERT for State-of-the-Art Transfer Learning in Text using Python
Kumar Nityan Suman (~nityansuman)
One of the biggest challenges in natural language processing (NLP) is the shortage of training data. Because NLP is a diversified field with many distinct tasks, most task-specific datasets contain only a few thousand or a few hundred thousand human-labeled training examples. However, modern deep learning-based NLP models benefit from much larger amounts of data, improving when trained on millions, or billions, of annotated training examples. BERT helps close this data gap.
BERT, or Bidirectional Encoder Representations from Transformers, is a new method of pre-training language representations that obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks.
What Makes BERT Different?
BERT builds upon recent work in pre-training contextual representations and establishes a new State-of-the-Art in several standard NLP tasks such as Sentiment Analysis, among others.
Why does this matter?
Pre-trained representations can either be context-free or contextual, and contextual representations can further be unidirectional or bidirectional. BERT uses a bidirectional contextual representation, in which the representation of each word draws on the words both before and after it. This is the most powerful combination of all.
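The three kinds of representation can be contrasted with a small toy sketch. It does not compute real embeddings; it only shows which neighbouring words each kind of representation is allowed to look at. The function name `visible_context` and the two example sentences are assumptions made for illustration and do not come from BERT or any library.

```python
# Toy illustration only: real models learn their representations. This just
# shows which words each kind of representation may use as context.

def visible_context(tokens, position, mode):
    """Return the tokens that a representation at `position` can see."""
    if mode == "context-free":
        # One fixed vector per word, regardless of its neighbours.
        return [tokens[position]]
    if mode == "unidirectional":
        # Left-to-right: the word itself plus everything before it.
        return tokens[: position + 1]
    if mode == "bidirectional":
        # The whole sentence, to the left and to the right of the word.
        return list(tokens)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical sentences: "bank" has the same left context in both,
# so only the words to its right reveal which sense is meant.
sentence_a = "the bank approved the loan".split()
sentence_b = "the bank of the river flooded".split()

for mode in ("context-free", "unidirectional", "bidirectional"):
    ctx_a = visible_context(sentence_a, sentence_a.index("bank"), mode)
    ctx_b = visible_context(sentence_b, sentence_b.index("bank"), mode)
    print(f"{mode}: contexts differ = {ctx_a != ctx_b}")
```

In this sketch only the bidirectional mode sees different contexts for the two uses of "bank", because the disambiguating words appear to its right.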
Importantly, BERT achieved all of its results with almost no task-specific changes to the neural network architecture. It improves State-of-the-Art results on more than 10 benchmark datasets in different NLP tasks.
Making BERT Work for You
The model we build here can be used for many different tasks, from Aspect-Based Sentiment Classification to Named Entity Recognition, just by fine-tuning it on a small or decent-sized dataset in a few hours or less.
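One way to picture why a single model can serve both a sentence-level task and a token-level task is a shared encoder feeding different small output heads. The sketch below is a toy under stated assumptions, not the workshop's code: `toy_encoder` is a hypothetical stand-in that produces pseudo-random vectors instead of BERT outputs, the heads are untrained, and every name and label set is invented for illustration. It only shows the data flow: a sentence-level head reads the first (`[CLS]`) vector, while a token-level head reads one vector per token.

```python
import random

HIDDEN = 4  # toy hidden size, chosen only to keep the example small


def toy_encoder(tokens):
    """Hypothetical stand-in for a pre-trained encoder (NOT BERT itself).

    Returns one vector per token. Each vector mixes the token's own
    pseudo-random embedding with the sentence average, so every output
    depends on the whole input, loosely mimicking a contextual encoder.
    """
    own = []
    for token in tokens:
        rng = random.Random(token)  # deterministic per token
        own.append([rng.uniform(-1, 1) for _ in range(HIDDEN)])
    mean = [sum(column) / len(own) for column in zip(*own)]
    return [[(a + b) / 2 for a, b in zip(vec, mean)] for vec in own]


def init_head(num_labels, seed):
    """Untrained task head: one weight row per label."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(num_labels)]


def predict(vector, head, labels):
    """Score each label with a dot product and return the best one."""
    scores = [sum(v * w for v, w in zip(vector, row)) for row in head]
    return labels[scores.index(max(scores))]


def classify_sentence(tokens, head, labels):
    """Sentence-level task: only the first ([CLS]) vector reaches the head."""
    vectors = toy_encoder(["[CLS]"] + tokens)
    return predict(vectors[0], head, labels)


def tag_tokens(tokens, head, labels):
    """Token-level task such as NER: every token vector reaches the head."""
    vectors = toy_encoder(["[CLS]"] + tokens)[1:]
    return [predict(vec, head, labels) for vec in vectors]


SENTIMENT_LABELS = ["negative", "neutral", "positive"]  # illustrative labels
NER_LABELS = ["O", "PER", "LOC"]  # illustrative labels

tokens = "alice loved the food in paris".split()
sentiment = classify_sentence(tokens, init_head(3, seed=0), SENTIMENT_LABELS)
tags = tag_tokens(tokens, init_head(3, seed=1), NER_LABELS)
print(sentiment)
print(list(zip(tokens, tags)))
```

Because the heads are untrained, the printed labels are meaningless. In actual fine-tuning, the head and the pre-trained encoder weights would both be updated on the target dataset.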
The aim of this workshop is to get you up and running with the current State-of-the-Art transfer learning technique in natural language processing.
Given the limited availability of datasets, understanding and getting started with the current State-of-the-Art transfer learning technique in natural language processing (NLP) with TensorFlow is the need of the hour. The architecture learned here can be used with minimal or no changes for many different NLP tasks, just by fine-tuning it on the target dataset in a few hours or less.
- Basic understanding of Python
- Intermediate understanding of concepts in Natural Language Processing such as
- Prior experience in coding with
- Basic understanding of BERT from the original paper
Nityan is a seasoned Data Scientist at Youplus Inc. with sound experience in solving real-world business problems across the domains of Natural Language Processing and Computer Vision.
He has worked in natural language processing for quite some time and has experience with more than a dozen NLP and Computer Vision projects, ranging from Text Classification, Text Similarity, and Anaphora Resolution to advanced topics such as Aspect Based Sentiment Classification and Question-Answering.
He is a data science evangelist who mentors university-level teams for state- and national-level hackathons. He has won the prestigious MLADS hackathon organized by Microsoft India. He also maintains a couple of open-source projects on the side for fun.