Authors: Yilun Xu, Weili Nie, Arash Vahdat
Abstract: Sampling from diffusion models involves a slow iterative process that hinders
their practical deployment, especially for interactive applications. To
accelerate generation speed, recent approaches distill a multi-step diffusion
model into a single-step student generator via variational score distillation,
which matches the distribution of samples generated by the student to the
teacher’s distribution. However, these approaches use the reverse
Kullback-Leibler (KL) divergence for distribution matching, which is known to
be mode-seeking. In this paper, we generalize the distribution matching
approach
using a novel $f$-divergence minimization framework, termed $f$-distill, that
covers different divergences with different trade-offs in terms of mode
coverage and training variance. We derive the gradient of the $f$-divergence
between the teacher and student distributions and show that it is expressed as
the product of their score differences and a weighting function determined by
their density ratio. This weighting function naturally emphasizes samples with
higher density in the teacher distribution when a less mode-seeking divergence
is used. We observe that the popular variational score distillation approach
using the reverse-KL divergence is a special case within our framework.
Empirically, we demonstrate that alternative $f$-divergences, such as
forward-KL and Jensen-Shannon divergences, outperform the current best
variational score distillation methods across image generation tasks. In
particular, when using Jensen-Shannon divergence, $f$-distill achieves current
state-of-the-art one-step generation performance on ImageNet64 and zero-shot
text-to-image generation on MS-COCO. Project page:
https://research.nvidia.com/labs/genair/f-distill
Source: http://arxiv.org/abs/2502.15681v1
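
Sketch (not quoted from the paper): to make the gradient structure described in the abstract concrete, assume the convention $D_f(p \,\|\, q_\theta) = \mathbb{E}_{x \sim q_\theta}\!\left[ f\!\big(p(x)/q_\theta(x)\big) \right]$ for a convex $f$, with student samples $x = G_\theta(z)$, teacher density $p$, and student density $q_\theta$. A gradient of the form the abstract describes, a density-ratio-dependent weight multiplying the teacher–student score difference, can then be written as

\[
\nabla_\theta D_f(p \,\|\, q_\theta)
\;\propto\;
-\,\mathbb{E}_{z}\!\left[
h\big(r(x)\big)\,
\big(\nabla_x \log p(x) - \nabla_x \log q_\theta(x)\big)^{\top}
\frac{\partial G_\theta(z)}{\partial \theta}
\right],
\qquad
r(x) = \frac{p(x)}{q_\theta(x)},
\]

where one illustrative choice of weighting (taken here as an assumption, not quoted from the paper) is $h(r) = r^2 f''(r)$. Under this assumed form, reverse KL ($f(r) = -\log r$) yields $h \equiv 1$, i.e. a constant-weight update consistent with variational score distillation being a special case, while forward KL ($f(r) = r \log r$) yields $h(r) = r$, which upweights samples where the teacher density exceeds the student's, matching the mode-coverage intuition stated in the abstract.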