Diffusion Models for Pythonistas





In recent years, diffusion models have played a pivotal role in changing how visual content such as images and videos is created and manipulated. With breakthrough models like Stable Diffusion and DALL-E pushing the boundaries, understanding diffusion models has become essential for anyone venturing into generative art and content creation. This two-part talk aims to unravel the magic behind diffusion models. The first part covers foundational concepts in a beginner-friendly manner, accompanied by intuitive visualizations. The second part delves into practical applications, showing how diffusion models can be used to create images and videos from text and other control signals. We will make use of the diffusers library, an open-source Python library containing high-quality implementations of diffusion models.

A (tentative) high-level plan of the talk is provided below:

  1. (15 mins) Unconditional Diffusion Models
    1. Iterative de-noising (The “reverse” process)
    2. The noising, i.e. “forward” process (The scheduler)
    3. Learning the noise with a model
    4. Encapsulating additional components (The pipeline)
  2. (15 mins) Example applications
    1. Unconditional image generation
    2. Text-to-image generation
    3. Image-guided text-to-image generation
    4. Animation and video generation
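
To give a flavour of items 1.1 to 1.3 in the plan above, here is a minimal pure-Python sketch of the noising ("forward") process and of how a noise estimate can be turned back into a clean sample. It is not taken from the talk material. The linear beta schedule, the step count of 1000, and all function names are assumptions chosen for illustration, and the true noise stands in for the trained model that the talk would cover.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances, one per timestep (an assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] = product of (1 - beta[s]) for all s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, noise, alpha_bar_t):
    """Forward process in closed form: x_t = sqrt(a) * x0 + sqrt(1 - a) * noise."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [a * x + b * n for x, n in zip(x0, noise)]


def predict_x0(xt, predicted_noise, alpha_bar_t):
    """Invert the forward step, given a noise estimate such as a model's output."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [(x - b * n) / a for x, n in zip(xt, predicted_noise)]


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

x0 = [0.5, -1.0, 0.25, 0.8]  # toy "image" with four pixels
noise = [rng.gauss(0.0, 1.0) for _ in x0]
t = 500
xt = add_noise(x0, noise, alpha_bars[t])

# With the true noise standing in for a perfect model, x0 is recovered
# up to floating-point error.
recovered = predict_x0(xt, noise, alpha_bars[t])
```

In a real setting the noise passed to `predict_x0` would come from a trained network, and the reverse process would apply many small de-noising steps rather than a single inversion.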


Prerequisites:

  • Python
  • Basics of Deep Learning

Content URLs:

Materials for Part 1:

  • Slides on Diffusion Models fundamentals (tentative; content is subject to change)

Materials for Part 2:

  • Slides on applications using diffusers (tentative; content is subject to change)

Speaker Info:

Speaker 1: Ayan holds a PhD in AI/ML and works as a Research Scientist at MediaTek Research UK. His research interests span theoretical fundamentals to applied problems, and he is currently heavily invested in diffusion models. His technical blogs, particularly those on diffusion, are popular, and he has authored several papers at top-tier conferences (e.g. NeurIPS, ICLR, ICML). He is a devoted Pythonista and a math lover.

Speaker 2: Sayak works on diffusion models at 🤗 Hugging Face. He is one of the maintainers of the diffusers library. He spends his time contributing impactful features to the library, training and babysitting diffusion models, and doing a bit of applied research in the space. Outside of work, Sayak can be found binging Suits for the n-th time.

Speaker Links:

Ayan Das:

Sayak Paul:

Section: Artificial Intelligence and Machine Learning
Type: Talk
Target Audience: Intermediate
Last Updated: