Meme Saheb - Using Dank Learning to Generate Original Meme Captions
Rumanu Bhardwaj (~rumanu)
This is a guided walk-through on training your own deep neural network to generate funny meme captions! We rely on a two-stage cascaded CNN + LSTM architecture: the CNN handles the computer vision task of identifying the meme template, and the LSTM handles the natural language processing task of generating relevant captions.
The model is trained on a web-scraped dataset that maps meme template labels to captions. We'll be using vanilla Python, Keras, PyTorch and fast.ai to create this from scratch.
We'll start the workshop with a brief review of the fundamentals on which this project is based, with a sprinkling of memes to make the content easier to digest.
Then, using the easy-to-use fast.ai library, we'll go step by step through training a CNN for meme template recognition (a computer vision application) and building an LSTM that handles long-term dependencies when constructing sentences in a seq-to-seq NLP task. Once both models are trained, they will be cascaded to generate the final output.
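The cascade described above can be pictured as two functions chained together. The following is a minimal sketch in plain Python, not the workshop's actual code: `make_cascade`, `toy_classifier` and `toy_generator` are hypothetical names, and the toy functions stand in for the trained CNN and LSTM only to show how data flows between the two stages.

```python
from typing import Callable, Tuple

# Hypothetical type aliases: in the workshop these roles are played by
# a trained CNN (image -> template label) and a trained LSTM (label -> caption).
TemplateClassifier = Callable[[bytes], str]
CaptionGenerator = Callable[[str], str]


def make_cascade(
    classify_template: TemplateClassifier,
    generate_caption: CaptionGenerator,
) -> Callable[[bytes], Tuple[str, str]]:
    """Chain the two stages: the classifier's label becomes the generator's input."""

    def run(image: bytes) -> Tuple[str, str]:
        label = classify_template(image)
        caption = generate_caption(label)
        return label, caption

    return run


# Toy stand-ins so the sketch runs without any trained model.
def toy_classifier(image: bytes) -> str:
    return "success-kid" if len(image) % 2 == 0 else "grumpy-cat"


def toy_generator(label: str) -> str:
    return f"<caption for {label}>"


meme_pipeline = make_cascade(toy_classifier, toy_generator)
```

Keeping the two stages behind plain callables means either model can be retrained or swapped without touching the other, as long as both agree on the set of template labels.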
The training data is web-scraped and consists of standard meme template labels mapped to their captions, ranked by upvotes on internet forums, which serve as a proxy metric for hilarity.
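One way to picture this ranking step is the sketch below. It assumes the scraped rows arrive as `(template, caption, upvotes)` tuples; that layout, the `top_captions` helper, and the sample rows are all invented for illustration and are not the workshop's actual schema or data.

```python
from collections import defaultdict


def top_captions(rows, k=2):
    """Group rows by template and keep the k most upvoted captions for each."""
    by_template = defaultdict(list)
    for template, caption, upvotes in rows:
        by_template[template].append((upvotes, caption))
    return {
        template: [
            caption
            for _, caption in sorted(items, key=lambda item: item[0], reverse=True)[:k]
        ]
        for template, items in by_template.items()
    }


# Invented sample rows, standing in for scraped data.
rows = [
    ("success-kid", "finally fixed the bug", 120),
    ("success-kid", "tests passed first try", 340),
    ("success-kid", "coffee machine works", 15),
    ("grumpy-cat", "i had fun once", 90),
]

ranked = top_captions(rows)
```

Filtering to the most upvoted captions per template is one possible use of the ranking; weighting training examples by upvotes would be another.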
As part of the tutorial, participants will get troubleshooting assistance until their models generate relevant results and they can gleefully claim that they made AI generate memes!
We will also discuss the ways in which bias can creep into such a system, as well as techniques and tools we can use to keep bias from affecting results in predictive AI.
This project was inspired by Abel L Peirson V and E Meltem Tolunay's Dank Learning. It has been ported to fast.ai, alongside other Python libraries such as NumPy, Pandas, Keras, and PyTorch, and has been significantly improved to generate better and faster results.
Workshop Outline by the minute:
- A (very) brief intro to the ML workflow, and Deep Learning using fast.ai [10-12 Minutes]
- Overview of why we’re using CNNs and LSTMs for the task at hand [10-12 Minutes]
- [Hands-On] Web Scraping Images of Memes to train our CNN [20-25 Minutes]
- [Hands-On] Creating a textual dataset of meme template labels mapped to captions [20-25 Minutes]
- [Hands-On] Training ResNet to identify meme templates and running tests to check the correctness of results [10 Minutes]
- [Hands-On] Creating an LSTM from scratch using fast.ai and training it using the text dataset [20-30 Minutes]
- Cascading the CNN and LSTM architectures, and running them end-to-end [10 Minutes]
- Discussion on ways in which bias can creep into our dataset and what we can do to keep it from affecting our results [10-15 Minutes]
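For the text-dataset and LSTM steps in the outline above, a language model is usually trained to predict the next token from the ones before it. The sketch below shows that data preparation in plain Python under two stated assumptions: whitespace tokenization, and conditioning on the template by placing its label as the first token. Both are illustrative choices, and `build_vocab` and `encode_pair` are hypothetical helpers rather than fast.ai functions.

```python
def build_vocab(texts):
    """Map each distinct whitespace token to an integer id (0 = padding, 1 = end of sequence)."""
    vocab = {"<pad>": 0, "<eos>": 1}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode_pair(label, caption, vocab):
    """Prefix the caption with its template label and return (inputs, targets).

    The targets are the inputs shifted left by one position, so at every step
    the model is asked to predict the token that follows.
    """
    tokens = f"{label} {caption}".lower().split() + ["<eos>"]
    ids = [vocab[token] for token in tokens]
    return ids[:-1], ids[1:]


# Invented example pair, standing in for one row of the text dataset.
pairs = [("success-kid", "tests passed first try")]
vocab = build_vocab(f"{label} {caption}" for label, caption in pairs)
inputs, targets = encode_pair("success-kid", "tests passed first try", vocab)
```

At generation time, the same convention would let the label predicted by the CNN seed the sequence, with the LSTM producing tokens until it emits `<eos>`.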
A basic idea of deep learning workflows helps but is not required, as a walkthrough of the fundamentals will be provided as part of the workshop.
Participants should have coded in Python before attending to get the most out of the workshop.
We'll be executing our code on Colab so that participants don't have to install dependencies on their own machines.
Data-Driven Advertising Practitioner. Customer Privacy Advocate. Uses NLP to solve for humor(ous?) use cases.
I like words and wordplay, and also data and language modeling, so I have worked in the past, and intend to keep working in the future, towards solving subsets of the problem of detecting or generating humour in language.
This is primarily exemplified by having built a binary classifier for sarcasm detection in a global team of 5 women. [Code available for view here: https://github.com/festusdrakon/SaveSheldon]
In addition, I am currently working towards improving a model for double entendre identification, for the automated detection of whether a sentence can be used as a pun.
With regard to other work, I have been working in data-driven marketing since 2014 and have a deep understanding of analytics, especially as applied to cross-platform advertising.
As an advocate for customer privacy while delivering customized permission marketing experiences, I have given a talk at Facebook Developer Circle in Delhi (Jan '20) to explain the basics of Differential Privacy and Federated Learning to the Developer Community as a Data Privacy practice while building cloud solutions.