Authors: Gen Li, Yuling Yan
Abstract: Score-based diffusion models, which generate new data by learning to reverse
a diffusion process that perturbs data from the target distribution into noise,
have achieved remarkable success across various generative tasks. Despite their
superior empirical performance, existing theoretical guarantees are often
constrained by stringent assumptions or suboptimal convergence rates. In this
paper, we establish a fast convergence theory for a popular SDE-based sampler
under minimal assumptions. Our analysis shows that, provided
$\ell_{2}$-accurate estimates of the score functions, the total variation
distance between the target and generated distributions is upper bounded by
$O(d/T)$ (ignoring logarithmic factors), where $d$ is the data dimensionality
and $T$ is the number of steps. This result holds for any target distribution
with a finite first-order moment. To our knowledge, this improves upon existing
convergence theory for both the SDE-based sampler and an ODE-based
sampler, while imposing minimal assumptions on the target data distribution and
score estimates. This is achieved through a novel set of analytical tools that
provides a fine-grained characterization of how the error propagates at each
step of the reverse process.
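
As a rough illustration of the kind of SDE-based sampler discussed above, the following Python sketch runs a standard DDPM-style reverse update for $T$ steps on a Gaussian target, for which the score function is available in closed form. The linear noise schedule, the Gaussian target, and all function names are assumptions made for this example and are not taken from the paper; the exact score stands in for the $\ell_{2}$-accurate score estimate.

```python
import numpy as np


def make_schedule(T, beta_min=1e-4, beta_max=0.02):
    """Linear noise schedule (an assumed choice, not the paper's)."""
    betas = np.linspace(beta_min, beta_max, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # alpha_bar_t = prod_{s<=t} alpha_s
    return betas, alphas, alpha_bars


def gaussian_score(x, alpha_bar, mu, sigma):
    """Exact score of X_t when X_0 ~ N(mu, sigma^2 I).

    The forward process gives X_t ~ N(sqrt(alpha_bar) * mu, var_t * I)
    with var_t = alpha_bar * sigma^2 + (1 - alpha_bar).
    """
    mean_t = np.sqrt(alpha_bar) * mu
    var_t = alpha_bar * sigma**2 + (1.0 - alpha_bar)
    return -(x - mean_t) / var_t


def ddpm_sample(n, mu, sigma, T=1000, seed=0):
    """Draw n samples with a DDPM-style stochastic reverse process."""
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    betas, alphas, alpha_bars = make_schedule(T)

    # Start from pure noise, Y_T ~ N(0, I).
    y = rng.standard_normal((n, mu.size))

    for t in range(T - 1, -1, -1):
        score = gaussian_score(y, alpha_bars[t], mu, sigma)
        # No fresh noise is injected at the final step.
        noise = rng.standard_normal(y.shape) if t > 0 else 0.0
        y = (y + betas[t] * score) / np.sqrt(alphas[t]) + np.sqrt(betas[t]) * noise
    return y
```

For example, `ddpm_sample(20000, [2.0, -1.0], 0.5)` should return samples whose empirical mean and standard deviation are close to the target values; replacing `gaussian_score` with a learned estimate would introduce the score error that the bound accounts for.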
Source: http://arxiv.org/abs/2409.18959v1