Synthesising Images from Text using Generative Adversarial Networks

Sairam tabibu (~sairam)



Description:


This workshop introduces one of the most interesting applications of generative adversarial networks (GANs) and gives hands-on experience with it: given a text description of an image, the GAN model generates an image matching that description.

The workshop is divided into two parts:

  1. A hands-on introduction to using word embeddings to capture textual information, and the basics of training a vanilla GAN.
  2. Combining the word embeddings with a two-stage stacked GAN to generate a relevant image. (We will provide pre-trained models, as training takes a long time.)

The workshop will then go over plausible applications of this technique. The first part of the workshop will be as follows:

  1. We will teach the basic aspects of NLP, i.e. word embeddings, with hands-on experience of Python libraries such as NLTK.
  2. We will then move on to the basics of training a vanilla GAN on participants' laptops using PyTorch, followed by a simple application.
  3. We will provide the audience with Jupyter notebooks containing skeleton code; the remaining code will be written on the spot. The aim of teaching the training procedure is to give the audience a feel for which parameters to keep in mind while training a neural network.
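
To give a feel for the vanilla GAN training procedure mentioned above, here is a minimal sketch in PyTorch. It is not the workshop notebook: the toy 2-D "dataset", the layer sizes, and the names `real_batch` and `train_step` are assumptions made for illustration, and the learning rate and Adam betas are just commonly used starting values.

```python
import torch
from torch import nn

torch.manual_seed(0)

NOISE_DIM = 8   # size of the generator's random input (assumed value)
DATA_DIM = 2    # toy "images" are 2-D points so this runs quickly on a laptop
BATCH = 64

# Generator maps noise to a fake sample; discriminator maps a sample to a real/fake logit.
G = nn.Sequential(nn.Linear(NOISE_DIM, 32), nn.ReLU(), nn.Linear(32, DATA_DIM))
D = nn.Sequential(nn.Linear(DATA_DIM, 32), nn.LeakyReLU(0.2), nn.Linear(32, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCEWithLogitsLoss()


def real_batch(n):
    """Hypothetical stand-in for a real dataset: points clustered around (2, -1)."""
    return torch.randn(n, DATA_DIM) * 0.5 + torch.tensor([2.0, -1.0])


def train_step():
    real = real_batch(BATCH)
    fake = G(torch.randn(BATCH, NOISE_DIM))
    ones, zeros = torch.ones(BATCH, 1), torch.zeros(BATCH, 1)

    # Discriminator step: push real samples towards 1 and fake samples towards 0.
    # detach() stops this loss from sending gradients into the generator.
    opt_d.zero_grad()
    d_loss = bce(D(real), ones) + bce(D(fake.detach()), zeros)
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator label fakes as real.
    opt_g.zero_grad()
    g_loss = bce(D(fake), ones)
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()


losses = [train_step() for _ in range(200)]
samples = G(torch.randn(5, NOISE_DIM)).detach()
print("last (d_loss, g_loss):", losses[-1])
print("generated samples shape:", tuple(samples.shape))
```

The parameters worth watching in a loop like this are the two learning rates, the batch size, the noise dimension, and the balance between discriminator and generator updates.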

The second part of the workshop will be as follows:

  1. We will train a GAN on the word embeddings to get a rough image representation, followed by a second GAN (stacked after the first) to get a full-resolution image (details are given in the StackGAN paper linked under Content URLs).
  2. We will provide the trained GAN model, as training it requires a lot of time.
  3. We will provide Jupyter notebooks giving the architecture and will write some parts of the stacked GANs on the spot.
  4. We will discuss possible applications of GANs in both research and industry. Some industrial applications:
    https://www.technologyreview.com/s/608668/amazon-has-developed-an-ai-fashion-designer/
    https://wwd.com/business-news/technology/ibm-watson-fashion-week-analysis-10842213/
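
The two-stage data flow described in this part can be sketched as below. This is a simplified illustration under stated assumptions, not the StackGAN architecture from the paper: the tiny vocabulary, the averaging `TextEncoder`, the class names `StageOne` and `StageTwo`, and the layer sizes are all hypothetical. The image sizes are shrunk to 16x16 and 64x64 (the paper uses 64x64 and 256x256) so that it runs quickly. The discriminators and the paper's conditioning augmentation are omitted, and the networks are untrained, so only the shapes are meaningful.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy vocabulary; a real setup would use pre-trained embeddings.
VOCAB = {"<unk>": 0, "a": 1, "small": 2, "red": 3, "bird": 4,
         "with": 5, "white": 6, "wings": 7}
EMB_DIM, NOISE_DIM = 16, 8


class TextEncoder(nn.Module):
    """Averages word embeddings into one sentence vector (a simplification)."""

    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(len(VOCAB), EMB_DIM)

    def forward(self, sentence):
        ids = torch.tensor([VOCAB.get(w, 0) for w in sentence.lower().split()])
        return self.emb(ids).mean(dim=0, keepdim=True)  # shape (1, EMB_DIM)


class StageOne(nn.Module):
    """Text vector + noise -> coarse 3x16x16 image."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(EMB_DIM + NOISE_DIM, 32 * 4 * 4)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # 4 -> 8
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Tanh(),   # 8 -> 16
        )

    def forward(self, text, noise):
        h = self.fc(torch.cat([text, noise], dim=1)).view(-1, 32, 4, 4)
        return self.up(h)


class StageTwo(nn.Module):
    """Coarse image + the same text vector -> refined 3x64x64 image."""

    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(3 + EMB_DIM, 32, 3, padding=1), nn.ReLU())
        self.up = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 32
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Tanh(),   # 32 -> 64
        )

    def forward(self, coarse, text):
        # Tile the text vector over every pixel so the refiner sees the description.
        b, _, h, w = coarse.shape
        text_map = text.view(b, EMB_DIM, 1, 1).expand(b, EMB_DIM, h, w)
        return self.up(self.encode(torch.cat([coarse, text_map], dim=1)))


text = TextEncoder()("a small red bird with white wings")
with torch.no_grad():
    coarse = StageOne()(text, torch.randn(1, NOISE_DIM))
    fine = StageTwo()(coarse, text)
print(tuple(coarse.shape), tuple(fine.shape))
```

Feeding the text vector into both stages is the key design point: the second stage can then correct details that the coarse image missed, rather than only upsampling it.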

Prerequisites:

Basics of NLP (word embeddings), basics of neural networks, and basics of Python, NumPy and PyTorch.

Content URLs:

http://openaccess.thecvf.com/content_ICCV_2017/papers/Zhang_StackGAN_Text_to_ICCV_2017_paper.pdf

https://pytorch.org/

Basic Summary of the workshop - goo.gl/p8Xhn6

Workshop part 1 slides - goo.gl/ZX2tTP

Workshop part 2 slides - goo.gl/y1oSc1

Workshop Notebooks ( in progress ) -

Speaker Info:

I (Sairam) am currently a research associate at the Center for Visual Information Technology, IIIT Hyderabad. I graduated in Electronics Engineering from IIT BHU last year. I have 4 years of experience in computer vision, with varied internships: at CWNU, South Korea, working on face recognition, and at NTU Singapore, working on topics from maritime vessel detection to crowd modelling. I'm currently working on cancer detection from slide images of cancerous tissues. I have led many workshops and tutorials teaching the basics of vision at my college, as well as a workshop on "Deception Detection" at PyCon 2016 (slides: https://docs.google.com/presentation/d/1tHt5EPol3KLu81E8aKlebZT3FH0uvPky4l3bHEwVbEQ/edit#slide=id.g162ec24d83_0_0).

Zeeshan is currently a research fellow at the Center for Visual Information Technology, IIIT Hyderabad. He graduated in Electrical Engineering from VJTI, Mumbai. He has 2 years of experience developing trading systems at Citi. He is currently working on gradient estimation for stochastic neural networks.

Speaker Links:

My LinkedIn profile can be viewed at: https://www.linkedin.com/in/sairamtabibu/

Zeeshan's Profile: https://www.linkedin.com/in/zeeshan-ashraf-508587137/

Section: Data science
Type: Workshops
Target Audience: Intermediate
Last Updated: