Improving image generation quality, from IP-Adapter to playing with attention layers: a stable approach to diffusion

Anustup



Description:

We have all been talking a lot about the capabilities of generative AI, whether LLMs or image generation pipelines. This talk focuses specifically on Stable Diffusion, since many companies are building exciting products with it. When we generate through a diffusion pipeline, several questions come up: how much noise to work with, how exactly the U-Net behaves with that noise, why LoRA fine-tuning is not producing good enough results, and so on. Today, from fashion model photography to educational content and fantasy images, photorealism plays a major role, and this is where most straightforward diffusion scripts fail. This talk covers some modern approaches to handling these problems:

  1. Introduction and a basic mathematical understanding of how diffusion works
  2. Why GANs failed, and why diffusion
  3. How exactly the attention layers work inside a diffusion pipeline
  4. How the standard mechanisms work: image-to-image, inpainting, and unconditional and conditional generation
  5. Common problems: I will pick practical problems from the fashion and media industries
  6. What is IP-Adapter, and how does it solve these problems?
  7. Finally, a workflow for this problem: you have a material image or a garment image, and you want to generate an image with exactly the same material or garment
    1. Loss and optimization will be discussed alongside the problem
    2. Q&A
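
As a taste of items 3 and 6, here is a minimal NumPy sketch of the decoupled cross-attention idea behind IP-Adapter: the latent queries attend to the text tokens and to the image-prompt tokens separately, and the two results are summed with a weight. The function names, tensor sizes, and the `scale` parameter are illustrative assumptions, and the learned projection matrices of a real U-Net are replaced by random arrays. It is not the code the talk will present.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: softmax(q k^T / sqrt(d)) v
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d))
    return weights @ v

def decoupled_cross_attention(q, text_k, text_v, image_k, image_v, scale=1.0):
    # Text branch plus a separately weighted image-prompt branch.
    # `scale` (hypothetical name) controls how strongly the image prompt steers the result.
    return attention(q, text_k, text_v) + scale * attention(q, image_k, image_v)

# Random stand-ins for projected features (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
d = 8
latent_q = rng.normal(size=(16, d))                                # 16 latent positions
text_k, text_v = rng.normal(size=(77, d)), rng.normal(size=(77, d))  # 77 text tokens
image_k, image_v = rng.normal(size=(4, d)), rng.normal(size=(4, d))  # 4 image tokens

out = decoupled_cross_attention(latent_q, text_k, text_v, image_k, image_v, scale=0.6)
print(out.shape)  # (16, 8)
```

With `scale=0.0` the image branch drops out and the result reduces to plain text cross-attention, which is one way to see why this weight is a natural knob to tune when the generated garment or material drifts from the reference.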

Prerequisites:

  • Understanding of calculus and matrix algebra is needed
  • Understanding of Python, machine learning models, and the basic workings of CNNs
  • Understanding of computer vision models or GANs is a plus

Video URL:

< will submit >

Speaker Info:

Love to code and play around with deep attention layers, imagining in better ways through diffusion and retrieving through LLMs. Working as an MLE at Caimera, automating brands' model photoshoots using diffusion models. Previously worked as an MLE at Newton School, Shell, Samsung, Google TensorFlow (GSoC), Dark Horse, IIT Patna, and some other startups. Ex-founder of MBK Health Tech, which built remote wearables for cardiac monitoring and signaling. Holds research and patents, and received the Indian Young Achiever Award '21 for AI contributions.

Speaker Links:

  • LinkedIn: https://in.linkedin.com/in/anustupmukherjee
  • YouTube playlist: https://youtube.com/playlist?list=PL97HpUneNh5dqFQ22uy050-KKrgtsed_X&si=ed01HW4IpkPxTcOd
  • GitHub: Anustup900

For the rest, you can find me by Googling :)

Section: Artificial Intelligence and Machine Learning
Type: Talk
Target Audience: Intermediate