
A Conditional Diffusion Model for Hand-drawing Generation

This homework implements an efficient classifier-free guidance (CFG) diffusion model that generates hand-drawn images conditioned on 12 class labels, using only 6,586,369 model parameters and 100 diffusion steps.

Introduction

Given a hand-drawn image dataset containing the following categories: "airplane", "bus", "car", "fish", "guitar", "laptop", "pizza", "sea turtle", "star", "t-shirt", "The Eiffel Tower", "yoga", we aim to generate new hand-drawn images conditioned on these categories.

We adopt a U-Net-based diffusion model to generate the images, and we use the classifier-free guidance (CFG) technique for class conditioning.
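Concretely, CFG trains a single network for both conditional and unconditional denoising by randomly replacing the class label with a "null" token during training. The sketch below illustrates this label-dropout step; the null-token convention and the 10% drop rate are common defaults, not details confirmed by this post:

```python
import numpy as np

NUM_CLASSES = 12          # the 12 drawing categories
NULL_LABEL = NUM_CLASSES  # extra token meaning "no condition" (assumed convention)
P_UNCOND = 0.1            # probability of dropping a label (assumed value)

def drop_labels(labels, rng):
    """Replace each label with the null token with probability P_UNCOND,
    so the same U-Net learns conditional and unconditional denoising."""
    mask = rng.random(labels.shape) < P_UNCOND
    return np.where(mask, NULL_LABEL, labels)

rng = np.random.default_rng(0)
labels = rng.integers(0, NUM_CLASSES, size=64)  # a toy training batch
dropped = drop_labels(labels, rng)
print(np.mean(dropped == NULL_LABEL))  # fraction of dropped labels, near P_UNCOND
```

Each training batch then mixes conditional and unconditional examples, which is what later lets a guidance scale trade off between the two predictions at sampling time.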

Results

The generated results are shown below:

Generated results at guidance scale = 0.5 (figure)

Generated results at guidance scale = 2.0 (figure)

Generated results at guidance scale = 5.0 (figure)
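The guidance scale w controls how strongly the unconditional noise prediction is extrapolated toward the conditional one: larger w yields samples that match the class more strongly, at the cost of diversity. A minimal sketch of this combination step, using the common CFG formula (the variable names and shapes here are illustrative, not from the project code):

```python
import numpy as np

def cfg_noise(eps_uncond, eps_cond, w):
    """Classifier-free guidance: extrapolate from the unconditional
    noise prediction toward the conditional one by guidance scale w."""
    return eps_uncond + w * (eps_cond - eps_uncond)

rng = np.random.default_rng(0)
eps_c = rng.standard_normal((1, 28, 28))  # toy conditional U-Net output
eps_u = rng.standard_normal((1, 28, 28))  # toy unconditional output (null label)

for w in (0.5, 2.0, 5.0):  # the scales compared above
    guided = cfg_noise(eps_u, eps_c, w)
    print(w, guided.shape)
```

Note that w = 1 recovers the purely conditional prediction and w = 0 the unconditional one, so w = 5.0 is a strong extrapolation beyond the conditional output.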

For more implementation details, please refer to the GitHub page of this project.

This post is licensed under CC BY 4.0 by the author.