Diffusion Models for Pythonistas
dasayan05
Description:
In recent years, Diffusion Models have played a pivotal role in revolutionizing how visual content such as images and videos is created and manipulated. With breakthrough models like Stable Diffusion and DALL-E pushing the boundaries, understanding diffusion models has become essential for anyone venturing into generative art and content creation. This two-part talk aims to unravel the magic behind diffusion models, starting with foundational concepts presented in a beginner-friendly manner and accompanied by intuitive visualizations. The second part will delve into practical applications, showcasing how diffusion models can be harnessed to create visual content such as images and videos from text and other control signals. We will make use of the diffusers library, an open-source Python library containing high-quality implementations of Diffusion Models.
A (tentative) high-level plan of the talk is provided below:
- (15 mins) Unconditional Diffusion Models
  - Iterative de-noising (the “reverse” process)
  - The noising, i.e. “forward” process (the scheduler)
  - Learning the noise with a model
  - Encapsulating additional components (the pipeline)
- (15 mins) Example applications
  - Unconditional image generation
  - Text-to-image generation
  - Image-guided text-to-image generation
  - Animation and video generation
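
To give a flavour of the "forward" process named in the plan above, here is a minimal sketch in plain Python. It assumes a DDPM-style linear noise schedule with 1000 steps; the function names (linear_beta_schedule, cumulative_alphas, add_noise) are illustrative only and are not the diffusers API, whose scheduler classes package comparable logic.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances (betas), linearly spaced (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) in closed form; also return the noise used."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    xt = [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]
    return xt, noise


rng = random.Random(0)                 # fixed seed for repeatability
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

x0 = [1.0, -1.0, 0.5, 0.0]             # a toy 4-pixel "image"
x_early, noise_early = add_noise(x0, 10, alpha_bars, rng)    # still close to x0
x_late, noise_late = add_noise(x0, 999, alpha_bars, rng)     # almost pure noise
```

At the last step almost none of the original signal remains, which is why generation can start from pure noise and run the "reverse" process; a model trained to predict the returned noise is what makes that reversal possible.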
Prerequisites:
- Python
- Basics of Deep Learning
Content URLs:
Materials for Part 1:
- Slides on Diffusion Models fundamentals (tentative; contents are subject to change)
Materials for Part 2:
- Slides on applications using diffusers (tentative; contents are subject to change)
Speaker Info:
Speaker 1: Ayan holds a PhD in AI/ML and works as a Research Scientist at MediaTek Research UK. His research interests span theoretical fundamentals to applied problems, and he is currently heavily invested in Diffusion Models. His technical blogs, specifically those on diffusion, are particularly popular, and he has authored several papers at top-tier conferences (e.g. NeurIPS, ICLR, ICML). He is a devoted Pythonista and a math lover.
- Ayan’s blogs (one, two and three) on Diffusion Models
- Full list of publications
- An invited talk at DataHour, Analytics Vidhya
Speaker 2: Sayak works on diffusion models at 🤗 Hugging Face. He is one of the maintainers of the diffusers library. He spends his time contributing impactful features to the library, training and babysitting diffusion models, and doing a bit of applied research in the space. Outside of work, Sayak can be found bingeing Suits for the n-th time.
- Sayak on a podcast interview
- The diffusers library
- Blogs (collection one and collection two), publications and talks
Speaker Links:
Ayan Das:
Sayak Paul: