Key Concepts
GD2-NeRF is a coarse-to-fine generative detail compensation framework that hierarchically integrates a GAN and pre-trained diffusion models into One-shot Generalizable Neural Radiance Fields (OG-NeRF), synthesizing novel views with vivid, plausible details without any finetuning at inference time.
Summary
The paper proposes the GD2-NeRF framework to address a limitation of existing OG-NeRF methods: their outputs are blurry because they rely heavily on a single, limited reference image.
At the coarse stage, the One-stage Parallel Pipeline (OPP) efficiently injects a GAN model into the OG-NeRF pipeline to capture in-distribution detail priors from the training dataset, achieving a good balance between sharpness and fidelity.
At the fine stage, the Diffusion-based 3D Enhancer (Diff3DE) leverages pre-trained image diffusion models to add rich out-of-distribution details while maintaining decent 3D consistency. Diff3DE relaxes the input of the original Inflated Self-Attention (ISA) from all keyframes to neighbor keyframe sets selected by view distance, which lets it process arbitrary views.
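The neighbor keyframe selection described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `select_neighbor_keyframes`, the parameter `k`, and the use of Euclidean distance between camera centers as the "view distance" are all assumptions made for illustration, since the summary does not specify how the distance is measured.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def select_neighbor_keyframes(
    target_pos: Vec3,
    keyframe_positions: Sequence[Vec3],
    k: int = 2,
) -> List[int]:
    """Return the indices of the k keyframes closest to the target view.

    Assumption: "view distance" is approximated here by the Euclidean
    distance between camera centers. The actual metric may differ.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Sort keyframe indices by distance to the target camera (stable sort).
    order = sorted(
        range(len(keyframe_positions)),
        key=lambda i: math.dist(target_pos, keyframe_positions[i]),
    )
    return order[:k]


# Hypothetical setup: four keyframe cameras on a unit circle around the object.
keyframes = [
    (1.0, 0.0, 0.0),   # 0 degrees
    (0.0, 1.0, 0.0),   # 90 degrees
    (-1.0, 0.0, 0.0),  # 180 degrees
    (0.0, -1.0, 0.0),  # 270 degrees
]

# An arbitrary target view at 30 degrees on the same circle.
target = (math.cos(math.radians(30)), math.sin(math.radians(30)), 0.0)

neighbors = select_neighbor_keyframes(target, keyframes, k=2)
print(neighbors)  # indices of the two nearest keyframes
```

In a setup like this, only the selected neighbor keyframes would supply context to the attention step for the target view, rather than the full keyframe set, which is what allows views that are not themselves keyframes to be processed.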
Extensive experiments on synthetic and real-world datasets show that GD2-NeRF noticeably improves the details while remaining inference-time finetuning-free.
Statistics
Given a single reference image, our method GD2-NeRF synthesizes novel views with vivid plausible details in an inference-time finetuning-free manner.
Our coarse-stage method OPP shows noticeable improvements over the baseline methods, balancing sharpness and fidelity at little additional cost.
Our fine-stage method Diff3DE can further add rich, plausible details with decent 3D consistency.
Quotes
"GD2-NeRF is a coarse-to-fine generative detail compensation framework that hierarchically includes GAN and pre-trained diffusion models into One-shot Generalizable Neural Radiance Fields (OG-NeRF) to synthesize novel views with vivid plausible details in an inference-time finetuning-free manner."
"At the coarse stage, the One-stage Parallel Pipeline (OPP) efficiently injects a GAN model into the OG-NeRF pipeline to capture in-distribution detail priors from the training dataset, achieving a good balance between sharpness and fidelity."
"At the fine stage, the Diffusion-based 3D Enhancer (Diff3DE) further leverages the pre-trained image diffusion models to complement rich out-distribution details while maintaining decent 3D consistency."