Neural network architectures in practice
I will discuss why neural network architectures matter for modeling real-world data, focusing on how specific architectures have evolved over time.
I will walk through a real-world computer vision use case to illustrate the major advantages of newer neural network architectures over their predecessors in terms of convergence and generalization, focusing on the architectural details that let the newer models outperform the older ones. (All comparisons in the computer vision section will be based on theoretical knowledge of the model architectures and their performance on well-known benchmarks and tasks.) The specific architectures I will be talking about are as follows:
COMPUTER VISION MODELS
Convolutional Neural Networks (CNNs):
- Convolutional Neural Networks are a class of deep learning models whose structure is loosely inspired by neurons in the visual cortex. The network recognizes basic features in images, such as edges and shapes, and composes them into higher-level patterns. CNNs are applied to several computer vision problems, such as image classification and object detection.
Capsule Networks (CapsNets):
- Capsule Networks are a relatively new class of deep learning models that, in addition to recognizing edges and shapes in images, capture more complex attributes such as the relative proportions, positions, and orientations of objects in an image. This ability helps Capsule Networks outperform plain CNNs on several computer vision tasks.
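To make the feature-recognition idea concrete, here is a minimal sketch of the 2D convolution at the heart of CNNs, using NumPy with a hand-written vertical-edge filter and a toy image (both are illustrative choices, not taken from any trained model):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution (no padding, stride 1)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output value is a weighted sum of a local image patch.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A vertical-edge filter: responds where brightness changes left to right.
edge_kernel = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])

# Toy image: bright left half, dark right half -> one vertical edge.
image = np.hstack([np.ones((5, 3)), np.zeros((5, 3))])
response = conv2d(image, edge_kernel)
print(response)  # strong response (3.0) only in the columns straddling the edge
```

In a trained CNN the kernel values are learned rather than hand-set, and many such filters are stacked in layers so that later layers detect increasingly abstract shapes.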
Residual Network (ResNet):
- Residual Neural Networks build on CNNs but add "skip connections": shortcuts that let a layer's input flow directly to a later layer, bypassing the layers in between. This eases gradient flow during training and mitigates the degradation problem seen in very deep plain networks, which makes it practical to train much deeper models and boosts performance substantially.
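As a rough NumPy sketch of the idea (a toy dense block, not the full ResNet block with convolutions and batch normalization): the skip connection adds the input back to the transformed output, so the block can fall back to an identity mapping when its weights are near zero:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy residual block: y = relu(f(x) + x), where f is a small two-layer net."""
    f = relu(x @ w1) @ w2  # the learned "residual" transformation f(x)
    return relu(f + x)     # skip connection: add the input back

rng = np.random.default_rng(0)
x = np.abs(rng.normal(size=(2, 4)))  # non-negative activations from a previous layer

# With zero weights, f(x) = 0 and the block reduces to the identity mapping:
w_zero = np.zeros((4, 4))
print(np.allclose(residual_block(x, w_zero, w_zero), x))  # True
```

Because the identity is so easy to represent, adding more blocks does not have to degrade training accuracy, which is what lets ResNets go far deeper than plain CNNs.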
I will talk about the above models in the context of a real world computer vision use-case and compare them on criteria such as convergence, generalization, scalability, etc. I will also talk about the differences in the model architectures that result in them performing better or worse.
Then, I will talk about the architecture of current state-of-the-art NLP models (GPT-2 and XLNet). I will walk through a real-world natural language processing (NLP) use case to demonstrate how these models scale on GPUs, focusing on the architectural details that enable them to outperform older models. (All demos and comparisons in the NLP section will be done live using IPython notebooks.)
- XLNet is one of the strongest pre-trained NLP models available, achieving state-of-the-art results on several NLP tasks, such as text classification and question answering.
- GPT-2 is another highly successful pre-trained NLP model and a close competitor of XLNet and BERT (another pre-trained NLP model).
I will discuss the above models in the context of a real-world natural language processing use case and compare them on criteria such as convergence, generalization, and scalability. The main focus will be the architectural details that make these models more successful than their predecessors.
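Both GPT-2 and XLNet are built on the Transformer's self-attention mechanism. Here is a minimal NumPy sketch of single-head scaled dot-product self-attention (the random weights are for illustration only, not trained parameters):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one sequence (seq_len x d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # pairwise token affinities
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # mix value vectors by attention weight

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))            # 5 tokens, model dimension 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 8): one updated vector per token
```

Because every token attends to every other token in one matrix multiplication, this computation parallelizes very well on GPUs, which is a large part of why these models scale.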
Basic knowledge of machine learning and the Python programming language
Quora Insincere Question Classification Project (Natural Language Processing)
- I solved this problem by fine-tuning the pre-trained BERT model in Python.
- Here is the GitHub repository for the project
I'm Tarun Sriranga Paparaju, a 17-year-old high school student. I love solving data science problems, and I am an open-source contributor to deep learning libraries on GitHub (github.com/SriRangaTarun) and a triple expert on Kaggle (kaggle.com/tarunpaparaju). I have won two bronze medals in data science competitions on Kaggle and finished 3rd out of 1,228 contestants in this competition. I am enthusiastic about learning and sharing knowledge about the internals of neural network architectures on my blog (srirangatarun.wordpress.com). I have a keen interest in algebra and vector calculus, with a strong programming background in Python.