Unboxing Decoder-only Transformers for Text Generation via Python

Nitin Aggarwal (~nitin46)




Hello Python Fans,

Large Language Models have already taken every business by storm. Machines can finally talk. When I decided it was time to learn about the great transformer model, my first question was: why does it work so well? What makes these models so strong that machines are actually talking? Being a mathematician, programmer, developer, data scientist, machine learning engineer, and AI scientist all rolled into one, I decided it was a great idea to unbox the large language model and understand in depth why it works. So I made myself look at every step of the model, especially the decoder-only transformer for text generation, and wrote a series of three articles on LinkedIn, which can be accessed through the following links:

  1. Why I love transformers?
  2. Understanding why Text Generators work using Decoder-only Transformers
  3. Unboxing the Decoder-only Transformers for Text Generators

What I learned here helped me modify the text encoders quite a bit to optimize them without breaking them. I want to run this workshop with the dream of having every Python enthusiast understand how these models work, so that they can build them on their own and train them on our own languages. I am also trying to turn this dream into a community, so that large language models can be trained for Indian languages.

What do I intend to achieve with this workshop?

  1. Have the participants understand just why these decoder-only text generators make ChatGPT so fluent in English.
  2. Help you write a package for a decoder-only text generator from scratch via PyTorch.
  3. Help you plan the training of a model on a few books with a very small vocabulary. (I will try to share a live example of this before the workshop.)
  4. Give you a brief review of the major research developments that have happened since ChatGPT changed the era.

I have very high hopes for this workshop, and I am looking forward to helping build an AI force of Python experts in India dedicated to advancing AI accessibility in Indian languages through the development of pretrained language models.

Thank you! With regards, Nitin


I am going to explain everything I do, but it would be wonderful if you could make sure you are comfortable with the following at a basic level, if not an intermediate level:

  1. Intermediate knowledge of basic Python, as covered in a short Python training course, including classes, methods, functions, modules, etc.
  2. Basic knowledge of neural networks, gained by reading this online chapter: Using neural nets to recognize handwritten digits, What is a neural network?
  3. Basic knowledge of matrix multiplication and vector algebra: Linear Algebra
  4. A three-minute video on visualizing points in high dimensions, shared by Google: A.I. Experiments: Visualizing High-Dimensional Space
  5. Patience and an open mind for understanding why ChatGPT can talk like humans.

Speaker Info:

Hello! This is Nitin Aggarwal, a cofounder of an AI research company called PradhiVrddhi Research. I wear a lot of hats in this company, including CEO, AI Researcher, Data Scientist, ML Engineer, Developer, and the usual things that startup founders do.

By education, I am a mathematician from the Indian Statistical Institute and the University of Kansas. I have also worked as the lead of research at a cybersecurity company. I have been coding in Python since 2015, when I moved from academia, where I pursued research in mathematics, to industry, entering as a data analyst with hopes of making it into research in Data Science, Machine Learning, and Artificial Intelligence. As you can see, I have partially achieved my goal, for as long as I can keep the company afloat.

Currently, I am actively pursuing research in large language models, or more specifically decoder-only text generators. I am fascinated by the field and would love to optimize these models enough that every laptop released after 2018 can train large language models on small datasets, for individuals interested in creating a digital twin of themselves in terms of their knowledge.

As for teaching experience, I taught various undergraduate-level courses in college algebra and calculus for three years while I was at the University of Kansas.

You can get to know me better from my LinkedIn profile: Nitin Aggarwal

Section: Artificial Intelligence and Machine Learning
Type: Workshops
Target Audience: Intermediate
Last Updated: