Authors: Runjie Yan, Yinbo Chen, Xiaolong Wang
Abstract: Score Distillation Sampling (SDS) has made significant strides in distilling
image-generative models for 3D generation. However, its
maximum-likelihood-seeking behavior often leads to degraded visual quality and
diversity, limiting its effectiveness in 3D applications. In this work, we
propose Consistent Flow Distillation (CFD), which addresses these limitations.
We begin by leveraging the gradient of the diffusion ODE or SDE sampling
process to guide the 3D generation. From the gradient-based sampling
perspective, we find that the consistency of 2D image flows across different
viewpoints is important for high-quality 3D generation. To achieve this, we
introduce multi-view consistent Gaussian noise on the 3D object, which can be
rendered from various viewpoints to compute the flow gradient. Our experiments
demonstrate that CFD, through consistent flows, significantly outperforms
previous methods in text-to-3D generation.
Source: http://arxiv.org/abs/2501.05445v1