Fine-Tuning BERT for State-of-the-Art Transfer Learning in Text using Python

Kumar Nityan Suman (~nityansuman)


One of the biggest challenges in natural language processing (NLP) is the shortage of training data. Because NLP is a diversified field with many distinct tasks, most task-specific datasets contain only a few thousand or a few hundred thousand human-labeled training examples. However, modern deep learning-based NLP models benefit from much larger amounts of data, improving when trained on millions, or even billions, of annotated training examples. BERT helps close this gap: the model is pre-trained once on a large amount of unannotated text and then fine-tuned on the small labeled dataset of the target task. BERT, or Bidirectional Encoder Representations from Transformers, is a method of pre-training language representations that obtains state-of-the-art results on a wide array of NLP tasks.

What Makes BERT Different?

BERT builds upon recent work in pre-training contextual representations and establishes a new State-of-the-Art in several standard NLP tasks such as Question-Answering, Sentence-Pair Classification, Sentiment Analysis, and so on.

Why does this matter?

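Pre-trained representations can be either context-free or contextual, and contextual representations can further be unidirectional or bidirectional. A context-free model such as word2vec or GloVe assigns each word a single vector regardless of the sentence it appears in, whereas a contextual model produces a representation that depends on the surrounding words. BERT uses a bidirectional contextual representation, so each word's representation draws on both its left and its right context, which makes it the most expressive of these combinations.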
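The difference between unidirectional and bidirectional context can be sketched with a toy visibility matrix. This is only an illustration of which positions each word may use as context, not BERT's actual implementation; the function name `visibility_mask` and the example sentence are made up for this sketch.

```python
def visibility_mask(num_tokens, bidirectional):
    """Return a num_tokens x num_tokens matrix of 0/1 flags.

    mask[i][j] == 1 means position i may use position j as context.
    (Hypothetical helper, for illustration only.)
    """
    mask = []
    for i in range(num_tokens):
        row = []
        for j in range(num_tokens):
            # A left-to-right model only sees positions at or before i;
            # a bidirectional model sees every position.
            visible = bidirectional or j <= i
            row.append(1 if visible else 0)
        mask.append(row)
    return mask


tokens = ["the", "bank", "of", "the", "river"]
unidirectional = visibility_mask(len(tokens), bidirectional=False)
bidirectional = visibility_mask(len(tokens), bidirectional=True)

for name, mask in [("unidirectional", unidirectional),
                   ("bidirectional", bidirectional)]:
    print(name)
    for token, row in zip(tokens, mask):
        print(f"  {token:>6}: {row}")
```

In the unidirectional case the row for "bank" cannot see "river", so a left-to-right model has to represent "bank" before it learns which kind of bank is meant. In the bidirectional case every row is fully visible.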

Performance

Importantly, BERT achieved all of its results with almost no task-specific changes to the neural network architecture. It improves State-of-the-Art results on more than 10 benchmark datasets in different NLP tasks.
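One reason so few architectural changes are needed is that single-sentence and sentence-pair tasks can share one input format. The sketch below follows the input layout described in the original BERT paper (a leading `[CLS]` token, `[SEP]` separators, and segment ids of 0 and 1). The helper name `pack_inputs` is hypothetical, and whitespace splitting is an assumption standing in for BERT's WordPiece tokenizer.

```python
def pack_inputs(text_a, text_b=None):
    """Pack one or two texts into BERT-style tokens and segment ids.

    Hypothetical helper: whitespace splitting stands in for the
    WordPiece tokenizer that BERT actually uses (assumption).
    """
    tokens_a = text_a.lower().split()
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"]
    # Segment 0 covers [CLS], the first text, and its [SEP].
    segment_ids = [0] * len(tokens)
    if text_b is not None:
        tokens_b = text_b.lower().split()
        tokens += tokens_b + ["[SEP]"]
        # Segment 1 covers the second text and its closing [SEP].
        segment_ids += [1] * (len(tokens_b) + 1)
    return tokens, segment_ids


# Single-sentence task, e.g. sentiment classification.
single_tokens, single_segments = pack_inputs("A great movie")
# Sentence-pair task, e.g. question answering.
pair_tokens, pair_segments = pack_inputs("Who wrote it", "Tolstoy wrote it")

print(single_tokens, single_segments)
print(pair_tokens, pair_segments)
```

Both calls produce the same kind of sequence, so the same encoder can consume either one and only a small task-specific output layer has to differ between tasks.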

Making BERT Work for You

The model we build here can be used for multiple tasks, from Question Answering, Text Classification, and Aspect-Based Sentiment Classification to Named Entity Recognition, just by fine-tuning it on a small or decent-sized dataset in a few hours or less.

Key Takeaways:

The aim of this workshop is to get you up and running with the current State-of-the-Art transfer learning technique in natural language processing.

Given the limited availability of labeled datasets, understanding and getting started with the current State-of-the-Art transfer learning technique in natural language processing (NLP) with TensorFlow is the need of the hour. The architecture learned here can be used with minimal or no changes for multiple tasks in the NLP domain, just by fine-tuning it on the target dataset in a few hours or less.


Prerequisites:

  • Basic understanding of Python.
  • Intermediate understanding of concepts in Natural Language Processing such as Contextual Embeddings and Attention.
  • Prior experience in coding with TensorFlow or Keras.
  • Basic understanding of Neural Networks.

Content URLs:

BERT from the original paper


Speaker Info:

Nityan is a seasoned Data Scientist at Youplus Inc. with sound experience in solving real-world business problems across the domains of Natural Language Processing and Computer Vision.

He has been working in the domain of natural language processing for quite some time now and has worked on more than a dozen NLP and Computer Vision projects, ranging from Text Classification, Text Similarity, and Anaphora Resolution to advanced topics such as Aspect-Based Sentiment Classification and Question Answering.

He is a data science evangelist. He mentors university-level teams for state and national level hackathons. He has himself won the prestigious MLADS hackathon organized by Microsoft India. He also maintains a couple of Open-Source projects on the side for fun.

Speaker Links:




Id: 1117
Section: Data Science, Machine Learning and AI
Type: Workshop
Target Audience: Advanced
Last Updated: