The paper presents SyncDreamer, a synchronized multiview diffusion model that generates multiview-consistent images from a single-view input image. The key idea is to model the joint probability distribution of multiview images, enabling the generation of consistent images across different views in a single reverse process.
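To make the "single reverse process" concrete: a synchronized multiview diffusion model denoises all N views in lockstep, so at every step each view can condition on the others. Below is a minimal, illustrative sketch of one such joint reverse step, not the authors' code; the names (`noise_predictor`, `alphas_cumprod`) and the deterministic DDIM-style update are assumptions chosen for brevity.

```python
# Minimal sketch of a synchronized multiview reverse diffusion step.
# All names here are illustrative, not SyncDreamer's actual API; the
# point is that the N views are denoised together, so each step can
# condition every view on all the others.
import torch

def synchronized_reverse_step(x_t, t, noise_predictor, alphas_cumprod):
    """One reverse step applied jointly to N views.

    x_t: (N, C, H, W) - the N noisy view images at timestep t.
    noise_predictor: a shared network that sees all N views at once
                     and returns per-view noise estimates (N, C, H, W).
    """
    a_t = alphas_cumprod[t]
    a_prev = alphas_cumprod[t - 1] if t > 0 else torch.tensor(1.0)
    # The predictor receives the whole stack, so cross-view information
    # flows at every denoising step (this is the "synchronized" part).
    eps = noise_predictor(x_t, t)
    # Estimate x_0 for all views jointly, then step to t-1
    # (deterministic DDIM-style update, eta = 0, for brevity).
    x0_hat = (x_t - torch.sqrt(1 - a_t) * eps) / torch.sqrt(a_t)
    x_prev = torch.sqrt(a_prev) * x0_hat + torch.sqrt(1 - a_prev) * eps
    return x_prev
```

Looping `t` from `T-1` down to `0` with the same shared predictor produces all N consistent views in a single reverse pass.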
The main highlights are:
SyncDreamer extends the diffusion framework to model the joint distribution of multiview images, introducing a synchronized multiview diffusion model. It runs N weight-sharing noise predictors to generate N images simultaneously, exchanging information across views through a 3D-aware attention mechanism (a simplified sketch of this cross-view sharing appears after this list).
SyncDreamer retains strong generalization ability by initializing its weights from the pretrained Zero123 model, allowing it to reconstruct shapes from both photorealistic images and hand drawings.
SyncDreamer makes single-view reconstruction simpler than distillation-based methods: the generated multiview-consistent images can be fed directly to a standard surface-reconstruction method such as NeuS, without distillation-specific losses like SDS.
SyncDreamer preserves creativity and diversity: from a single input image it can generate multiple plausible objects, whereas previous distillation-based methods converge to a single shape.
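As noted above, the views communicate through 3D-aware attention; in the paper this involves aggregating view features into a shared 3D volume and attending along each pixel's ray. The hedged sketch below deliberately simplifies that to plain cross-view attention so the information-sharing structure is visible; `CrossViewAttention` and its tensor shapes are illustrative assumptions, not SyncDreamer's actual module.

```python
# Simplified sketch of cross-view information sharing via attention.
# SyncDreamer's real 3D-aware attention samples features from a shared
# 3D volume along each pixel's ray; this sketch reduces that idea to
# ordinary cross-view attention over all views' tokens.
import torch
import torch.nn as nn

class CrossViewAttention(nn.Module):
    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feats):
        # feats: (N, L, D) - per-view token sequences for N views,
        # L tokens per view, D channels.
        n, l, d = feats.shape
        # Every view queries the pooled tokens of all N views, so each
        # noise predictor sees what the others are generating.
        all_tokens = feats.reshape(1, n * l, d).expand(n, -1, -1)
        out, _ = self.attn(query=feats, key=all_tokens, value=all_tokens)
        return feats + out  # residual, as is typical in UNet attention
```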
Experiments on the Google Scanned Objects (GSO) dataset show that SyncDreamer outperforms baseline methods in both multiview consistency and 3D reconstruction quality.
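For context on what such an evaluation measures, here is an illustrative sketch of the two metric families typically involved: an image-space error between generated and ground-truth views (e.g., PSNR) and a point-cloud distance for geometry (e.g., Chamfer distance). These are generic implementations, not a reproduction of the paper's exact evaluation protocol.

```python
# Generic metric sketches for multiview consistency (PSNR) and
# 3D reconstruction quality (Chamfer distance). Protocol details such
# as cloud sizes and normalization are assumptions.
import numpy as np

def psnr(img_a, img_b, max_val=1.0):
    """PSNR between two equal-shape images with values in [0, max_val]."""
    mse = np.mean((img_a - img_b) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def chamfer_distance(pts_a, pts_b):
    """Symmetric Chamfer distance between (M, 3) and (K, 3) point clouds.

    O(M*K) brute force - adequate for small evaluation clouds.
    """
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```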
Source: Yuan Liu et al., https://arxiv.org/pdf/2309.03453.pdf