CS 180 Project 5

Kevin Sheng


CS 180 Project 5: Fun with Diffusion Models

Project Overview

The goal of this project is to further my understanding of image representation and manipulation, and to produce large image sections that are the product of multiple individual images stitched together.




Part 0: Setup

We first set up the diffusion model. The random seed used for this project is 61.

64×64 images with 20 inference steps
256×256 images with 20 inference steps
64×64 images with 30 inference steps
256×256 images with 30 inference steps



Part 1: Sampling Loops




1.1: Implementing the Forward Process

To start off, we introduce noise into our test image with the forward process.

The original test image
Noise at t=250
Noise at t=500
Noise at t=750
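
The forward process above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the linear beta schedule, the `forward` helper, and the random placeholder image are all assumptions, since the pretrained model ships its own noise schedule.

```python
import numpy as np

# Assumed schedule: linear betas over 1000 steps (a common DDPM default).
# The pretrained model provides its own alphas_cumprod; this is a stand-in.
betas = np.linspace(1e-4, 0.02, 1000)
alphas_cumprod = np.cumprod(1.0 - betas)

def forward(x0, t, rng):
    """Sample x_t from q(x_t | x_0): scale the clean image, then add Gaussian noise."""
    abar = alphas_cumprod[t]
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps
    return x_t, eps

rng = np.random.default_rng(61)
x0 = rng.uniform(-1.0, 1.0, size=(3, 64, 64))  # placeholder for the test image
noisy = {t: forward(x0, t, rng)[0] for t in (250, 500, 750)}
```

Because `alphas_cumprod` shrinks as t grows, less of the original image survives at t=750 than at t=250.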



1.2: Classical Denoising

To start denoising, we can first try classical Gaussian blur filtering.

Noise at t=250
Noise at t=500
Noise at t=750
Gaussian blur denoising at t=250
Gaussian blur denoising at t=500
Gaussian blur denoising at t=750
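
For reference, a Gaussian blur can be written as a separable convolution. The sketch below is one possible version in NumPy; the kernel size, sigma, and function names are illustrative assumptions rather than the settings used for the images above.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """1-D Gaussian kernel normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-0.5 * (ax / sigma) ** 2)
    return k / k.sum()

def gaussian_blur(img, size=5, sigma=2.0):
    """Separable blur of an (H, W) image; assumes an odd kernel size."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")  # replicate border pixels
    # Convolve every row, then every column, keeping only the valid region.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

rng = np.random.default_rng(61)
noise = rng.standard_normal((32, 32))
blurred = gaussian_blur(noise)
```

Blurring averages out the noise, but it averages out image detail just as much, which is why this baseline struggles at higher noise levels.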



1.3: One-Step Denoising

Now, we can actually use the diffusion model to denoise the images. We start with one-step denoising.

The original test image
Noise at t=250
Noise at t=500
Noise at t=750
One-step denoising at t=250
One-step denoising at t=500
One-step denoising at t=750
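
One-step denoising amounts to inverting the forward-process equation once the model has estimated the noise. The sketch below assumes the noise estimate is already available; `estimate_x0` and the value of `abar_t` are hypothetical, and the true noise stands in for the model's prediction.

```python
import numpy as np

def estimate_x0(x_t, eps_hat, abar_t):
    """Solve x_t = sqrt(abar) * x0 + sqrt(1 - abar) * eps for x0."""
    return (x_t - np.sqrt(1.0 - abar_t) * eps_hat) / np.sqrt(abar_t)

rng = np.random.default_rng(61)
x0 = rng.uniform(-1.0, 1.0, size=(3, 64, 64))  # placeholder clean image
eps = rng.standard_normal(x0.shape)
abar_t = 0.5                                    # hypothetical cumulative alpha at some t
x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps

# With a perfect noise estimate the clean image is recovered exactly.
# A real model's estimate is imperfect, so its result is only approximate.
x0_hat = estimate_x0(x_t, eps, abar_t)
```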



1.4: Iterative Denoising

Diffusion models are designed to denoise iteratively. In this part we will implement this.

Noisy Campanile at t=90
Noisy Campanile at t=240
Noisy Campanile at t=390
Noisy Campanile at t=540
Noisy Campanile at t=690
The original test image
Iteratively denoised Campanile
One-step denoised Campanile
Gaussian blur filtered Campanile
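
The iterative loop can be sketched as follows. This is an illustration under stated assumptions: the linear beta schedule, the stride-30 timestep list starting at 990, and the `predict_noise` callable (a stand-in for the diffusion model) are not taken from the project code. Each step estimates the clean image, then blends it with the current noisy image to move to the next, less noisy timestep.

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)          # assumed linear schedule
alphas_cumprod = np.cumprod(1.0 - betas)
strided_timesteps = list(range(990, -1, -30))  # assumed: 990, 960, ..., 30, 0

def iterative_denoise(x, i_start, predict_noise, rng=None):
    """Denoise x from strided_timesteps[i_start] down to t = 0.

    predict_noise(x, t) is a hypothetical stand-in for the model.
    If rng is given, posterior noise is added at every step except the last.
    """
    for i in range(i_start, len(strided_timesteps) - 1):
        t, t_prev = strided_timesteps[i], strided_timesteps[i + 1]
        abar_t, abar_prev = alphas_cumprod[t], alphas_cumprod[t_prev]
        alpha = abar_t / abar_prev
        beta = 1.0 - alpha
        eps_hat = predict_noise(x, t)
        # Current estimate of the clean image.
        x0_hat = (x - np.sqrt(1.0 - abar_t) * eps_hat) / np.sqrt(abar_t)
        # Interpolate between the clean estimate and the current noisy image.
        x = (np.sqrt(abar_prev) * beta / (1.0 - abar_t)) * x0_hat \
            + (np.sqrt(alpha) * (1.0 - abar_prev) / (1.0 - abar_t)) * x
        if rng is not None and t_prev > 0:
            sigma = np.sqrt(beta * (1.0 - abar_prev) / (1.0 - abar_t))
            x = x + sigma * rng.standard_normal(x.shape)
    return x
```

With these assumed timesteps, `i_start=10` corresponds to t=690, and passing pure Gaussian noise with `i_start=0` would generate an image from scratch.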



1.5: Diffusion Model Sampling

In part 1.4, we used the diffusion model to denoise an image. We can also use the iterative_denoise function to generate images from scratch by starting from pure noise.

Sample 1
Sample 2
Sample 3
Sample 4
Sample 5



1.6: Classifier-Free Guidance (CFG)

To greatly improve image quality (at the expense of image diversity), we can use a technique called Classifier-Free Guidance.

Sample 1 with CFG
Sample 2 with CFG
Sample 3 with CFG
Sample 4 with CFG
Sample 5 with CFG
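
The core of CFG is a single extrapolation between two noise estimates. In the sketch below, `cfg_noise` and the default guidance scale of 7 are illustrative assumptions, and random arrays stand in for the model's conditional and unconditional predictions.

```python
import numpy as np

def cfg_noise(eps_uncond, eps_cond, gamma=7.0):
    """Push the noise estimate past the conditional one, away from the unconditional one."""
    return eps_uncond + gamma * (eps_cond - eps_uncond)

rng = np.random.default_rng(61)
eps_uncond = rng.standard_normal((3, 64, 64))  # stand-in: estimate with an empty prompt
eps_cond = rng.standard_normal((3, 64, 64))    # stand-in: estimate with the text prompt
eps = cfg_noise(eps_uncond, eps_cond)
```

Setting `gamma=0` gives the unconditional estimate and `gamma=1` gives the conditional one. Values above 1 exaggerate the difference between them, which is where the quality gain comes from.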



1.7: Image-to-image Translation

In part 1.4, we took a real image, added noise to it, and then denoised it. This effectively allows us to make edits to existing images. Here, we take the original test image, noise it a little, and force it back onto the image manifold without any conditioning. With a low enough noise level, the result is an image similar to the test image.

The original image
SDEdit with i_start=1
SDEdit with i_start=3
SDEdit with i_start=5
SDEdit with i_start=7
SDEdit with i_start=10
SDEdit with i_start=20
The original image
SDEdit with i_start=1
SDEdit with i_start=3
SDEdit with i_start=5
SDEdit with i_start=7
SDEdit with i_start=10
SDEdit with i_start=20
The original image
SDEdit with i_start=1
SDEdit with i_start=3
SDEdit with i_start=5
SDEdit with i_start=7
SDEdit with i_start=10
SDEdit with i_start=20
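
To see why a larger i_start stays closer to the original, it helps to look at where each value starts in the noise schedule. The sketch below assumes a linear beta schedule and a stride-30 timestep list starting at 990; `sdedit_start` is a hypothetical helper that only prepares the noised starting image, which would then be handed to the iterative denoiser.

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)          # assumed linear schedule
alphas_cumprod = np.cumprod(1.0 - betas)
strided_timesteps = list(range(990, -1, -30))  # assumed: 990, 960, ..., 30, 0

def sdedit_start(x0, i_start, rng):
    """Noise the clean image to the timestep selected by i_start."""
    t = strided_timesteps[i_start]
    abar = alphas_cumprod[t]
    x_t = np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * rng.standard_normal(x0.shape)
    return x_t, t

rng = np.random.default_rng(61)
x0 = rng.uniform(-1.0, 1.0, size=(3, 64, 64))  # placeholder image
for i_start in (1, 3, 5, 7, 10, 20):
    x_t, t = sdedit_start(x0, i_start, rng)
    # alphas_cumprod[t] is the share of signal variance kept at the start.
    print(i_start, t, round(float(alphas_cumprod[t]), 4))
```

A small i_start begins near pure noise, so the model invents most of the image. A large i_start keeps much of the original structure.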



1.7.1: Editing Hand-Drawn and Web Images

This procedure works particularly well if we start with a nonrealistic image (e.g. a painting, a sketch, or some scribbles) and project it onto the natural image manifold.

The original image
Avocado at i_start=1
Avocado at i_start=3
Avocado at i_start=5
Avocado at i_start=7
Avocado at i_start=10
Avocado at i_start=20
The original image
Bee at i_start=1
Bee at i_start=3
Bee at i_start=5
Bee at i_start=7
Bee at i_start=10
Bee at i_start=20
The original image
Guy at i_start=1
Guy at i_start=3
Guy at i_start=5
Guy at i_start=7
Guy at i_start=10
Guy at i_start=20



1.7.2: Inpainting

We can use the same procedure to implement inpainting.

The original image
The mask
Campanile inpainted
The original image
The mask
Cat inpainted
The original image
The mask
Dog inpainted
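
One way to implement inpainting is to reset everything outside the mask to a freshly noised copy of the original after each denoising step, so that only the masked region is actually generated. The sketch below shows that single reset step. The schedule, the `inpaint_step` name, and the convention that the mask is 1 where new content is wanted are assumptions.

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)  # assumed linear schedule
alphas_cumprod = np.cumprod(1.0 - betas)

def inpaint_step(x_t, x_orig, mask, t, rng):
    """Keep x_t where mask == 1; elsewhere use the original noised to timestep t."""
    abar = alphas_cumprod[t]
    noised_orig = np.sqrt(abar) * x_orig \
        + np.sqrt(1.0 - abar) * rng.standard_normal(x_orig.shape)
    return mask * x_t + (1.0 - mask) * noised_orig

rng = np.random.default_rng(61)
x_orig = rng.uniform(-1.0, 1.0, size=(3, 64, 64))  # placeholder original image
x_t = rng.standard_normal(x_orig.shape)            # placeholder current sample
mask = np.zeros((1, 64, 64))
mask[:, 16:48, 16:48] = 1.0                        # region to regenerate
x_t = inpaint_step(x_t, x_orig, mask, 500, rng)
```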



1.7.3: Text-Conditional Image-to-image Translation

Now, we will do the same thing as SDEdit, but guide the projection with a text prompt.

The original image
Rocket at noise level 1
Rocket at noise level 3
Rocket at noise level 5
Rocket at noise level 7
Rocket at noise level 10
Rocket at noise level 20
The original image
Coast at noise level 1
Coast at noise level 3
Coast at noise level 5
Coast at noise level 7
Coast at noise level 10
Coast at noise level 20
The original image
Coast at noise level 1
Coast at noise level 3
Coast at noise level 5
Coast at noise level 7
Coast at noise level 10
Coast at noise level 20



1.8: Visual Anagrams

In this part, we will create an image that looks like "an oil painting of people around a campfire", but when flipped upside down will reveal "an oil painting of an old man".

an oil painting of an old man
a man wearing a hat
a photo of the amalfi cost
an oil painting of people around a campfire
a rocket ship
a photo of a dog
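
A visual anagram can be produced by averaging two noise estimates at every denoising step: one for the upright image with the first prompt, and one for the flipped image with the second prompt, flipped back afterwards. In the sketch below, `predict_noise` is a hypothetical stand-in for the text-conditioned model, and the toy lambda exists only to make the example runnable.

```python
import numpy as np

def anagram_noise(x_t, t, predict_noise, prompt_up, prompt_down):
    """Average the upright estimate with the un-flipped estimate of the flipped image."""
    eps_up = predict_noise(x_t, t, prompt_up)
    flipped = np.flip(x_t, axis=-2)  # vertical flip of a (C, H, W) array
    eps_down = np.flip(predict_noise(flipped, t, prompt_down), axis=-2)
    return 0.5 * (eps_up + eps_down)

# Toy stand-in for the model: scales its input depending on the prompt.
toy_model = lambda x, t, prompt: x * (1.0 if prompt == "campfire" else 2.0)

rng = np.random.default_rng(61)
x_t = rng.standard_normal((3, 64, 64))
eps = anagram_noise(x_t, 500, toy_model, "campfire", "old man")
```

The combined estimate then replaces the single-prompt estimate in the usual denoising update.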



1.9: Hybrid Images

In this part we'll implement Factorized Diffusion and create hybrid images just like in project 2.

Hybrid image of a skull and a waterfall
Hybrid image of a skull and a waterfall
Hybrid image of a barista and an old man
Hybrid image of a barista and an old man
Hybrid image of an old man and a skull
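
For hybrid images, the two noise estimates are combined in frequency space: the low frequencies of one prompt's estimate plus the high frequencies of the other's. The sketch below is a minimal NumPy version. The Gaussian low-pass filter, its kernel size of 33 and sigma of 2, and the function names are assumptions, and random arrays stand in for the model's predictions.

```python
import numpy as np

def lowpass(x, size=33, sigma=2.0):
    """Gaussian blur over the last two axes (odd kernel size, edge padding)."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-0.5 * (ax / sigma) ** 2)
    k /= k.sum()
    pad = size // 2
    widths = [(0, 0)] * (x.ndim - 2) + [(pad, pad), (pad, pad)]
    padded = np.pad(x, widths, mode="edge")
    conv = lambda v: np.convolve(v, k, mode="valid")
    # Blur along the width axis, then along the height axis.
    return np.apply_along_axis(conv, -2, np.apply_along_axis(conv, -1, padded))

def hybrid_noise(eps_far, eps_near):
    """Low frequencies from eps_far (seen from afar), high frequencies from eps_near."""
    return lowpass(eps_far) + (eps_near - lowpass(eps_near))

rng = np.random.default_rng(61)
eps_far = rng.standard_normal((3, 64, 64))   # stand-in: estimate for the far-view prompt
eps_near = rng.standard_normal((3, 64, 64))  # stand-in: estimate for the close-up prompt
eps = hybrid_noise(eps_far, eps_near)
```

Feeding the same estimate in for both prompts returns it unchanged, which is a quick sanity check that the low-pass and high-pass parts add back up to the whole.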