Authors: Gen Li, Yuling Yan
Abstract: Score-based generative models (SGMs) have revolutionized the field of
generative modeling, achieving unprecedented success in generating realistic
and diverse content. Despite these empirical advances, the theoretical basis
for why optimizing the evidence lower bound (ELBO) on the log-likelihood is
effective for training diffusion generative models, such as denoising
diffusion probabilistic models (DDPMs), remains largely unexplored. In this
paper, we address this question by establishing a density
formula for a continuous-time diffusion process, which can be viewed as the
continuous-time limit of the forward process in an SGM. This formula reveals
the connection between the target density and the score function associated
with each step of the forward process. Building on this, we demonstrate that
the minimizer of the optimization objective for training DDPMs nearly coincides
with that of the true objective, providing a theoretical foundation for
optimizing DDPMs using the ELBO. Furthermore, we offer new insights into the
role of score-matching regularization in training GANs, the use of the ELBO in
diffusion classifiers, and the recently proposed diffusion loss.
Source: http://arxiv.org/abs/2408.16765v1
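For context, the DDPM setup the abstract refers to can be sketched as follows. This is a minimal sketch of the textbook formulation; the paper's exact notation, noise schedule, and weighting may differ.

% Forward process: data X_0 ~ p_data is gradually corrupted by Gaussian noise.
\[
  X_t = \sqrt{1-\beta_t}\, X_{t-1} + \sqrt{\beta_t}\, W_t,
  \qquad W_t \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, I_d),
  \quad t = 1, \dots, T.
\]
% Training maximizes the ELBO on the log-likelihood, which (up to additive
% constants) reduces to a weighted denoising score-matching objective: the
% learned score s_t should approximate \nabla \log q_t, where q_t denotes the
% marginal density of X_t under the forward process.
\[
  \log p_\theta(x_0) \;\ge\; \mathrm{ELBO}(\theta)
  = -\sum_{t=1}^{T} w_t\,
    \mathbb{E}\bigl[\bigl\| s_t(X_t) - \nabla \log q_t(X_t) \bigr\|_2^2\bigr]
    + \mathrm{const},
\]
% for suitable weights w_t > 0 determined by the schedule \{\beta_t\}.

The paper's density formula concerns the continuous-time limit of this forward process, relating the target density p_data to the scores \nabla \log q_t; the weights w_t here are illustrative placeholders, not the paper's specific choice.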