Core Concepts
DreamScene360 presents a novel pipeline that generates high-quality, immersive 360-degree 3D scenes from text prompts, without constraints on scene complexity or content.
Abstract
DreamScene360 introduces a text-to-3D pipeline that generates immersive 360-degree scenes of in-the-wild environments in a matter of minutes. The approach leverages the generative power of a 2D diffusion model together with prompt self-refinement to create a high-quality, globally coherent panoramic image, which serves as the foundation for the subsequent 3D scene generation.
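A minimal sketch of this two-stage pipeline is given below; the helper names (generate_panorama, lift_to_3d, dreamscene360) are hypothetical placeholders for illustration, not the authors' actual API.

```python
# Minimal sketch of the two-stage pipeline described above.
# All function names here are hypothetical placeholders.

def generate_panorama(prompt: str):
    """Stage 1: synthesize a globally coherent 360-degree panorama with a
    2D diffusion model, refining the text prompt along the way."""
    raise NotImplementedError  # a panorama-capable diffusion backbone would go here

def lift_to_3d(panorama):
    """Stage 2: initialize a geometric field and 3D Gaussians from the
    panorama, then optimize them into a renderable scene."""
    raise NotImplementedError

def dreamscene360(prompt: str):
    panorama = generate_panorama(prompt)  # 2D foundation image
    scene = lift_to_3d(panorama)          # immersive 3D Gaussian scene
    return scene
```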
To transform the 2D panorama into a 3D scene, DreamScene360 initializes a geometric field and 3D Gaussians. Semantic and geometric correspondences are then employed as regularizations to optimize the 3D Gaussians, addressing the challenges posed by the single-view input and filling the gaps in the unseen regions.
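As a rough illustration of how such regularizations might enter the optimization, the sketch below combines a photometric loss with semantic and geometric consistency terms; the loss weights, tensor shapes, and exact terms are assumptions for illustration, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def total_loss(rendered, target,
               sem_feat_rendered, sem_feat_ref,
               depth_rendered, depth_ref,
               w_sem: float = 0.1, w_geo: float = 0.1) -> torch.Tensor:
    # photometric loss on the rendered view vs. the reference view
    photometric = F.l1_loss(rendered, target)
    # semantic regularization (assumed form): encourage rendered-view features
    # to match the reference panorama's features, e.g. from a 2D vision model
    semantic = 1.0 - F.cosine_similarity(
        sem_feat_rendered.flatten(1), sem_feat_ref.flatten(1), dim=1).mean()
    # geometric regularization (assumed form): keep rendered depth consistent
    # with the depth from the initialized geometric field
    geometric = F.l1_loss(depth_rendered, depth_ref)
    return photometric + w_sem * semantic + w_geo * geometric

# example usage with dummy tensors
B, C, H, W = 1, 3, 64, 128
loss = total_loss(torch.rand(B, C, H, W), torch.rand(B, C, H, W),
                  torch.rand(B, 256, 8, 16), torch.rand(B, 256, 8, 16),
                  torch.rand(B, 1, H, W), torch.rand(B, 1, H, W))
```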
The self-refinement process leverages GPT-4V to provide feedback and prompt revision suggestions, enhancing the visual quality and text-image alignment of the generated panoramas. This user-friendly feature eliminates the need for extensive prompt engineering, a common challenge in previous text-to-3D generation methods.
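A sketch of what such a feedback loop could look like is shown below; draft_panorama, ask_gpt4v_for_feedback, and the scoring scheme are illustrative placeholders rather than the paper's exact procedure.

```python
def draft_panorama(prompt: str):
    """Placeholder: generate a draft panorama with the 2D diffusion model."""
    raise NotImplementedError

def ask_gpt4v_for_feedback(prompt: str, image):
    """Placeholder: ask GPT-4V to rate visual quality and text-image
    alignment, and to suggest a revised prompt."""
    raise NotImplementedError

def self_refine(prompt: str, max_rounds: int = 3):
    """Iteratively draft a panorama, collect GPT-4V feedback, and keep the
    best-scoring (prompt, image) pair."""
    best_prompt, best_image, best_score = prompt, None, float("-inf")
    for _ in range(max_rounds):
        image = draft_panorama(prompt)
        score, revised_prompt = ask_gpt4v_for_feedback(prompt, image)
        if score > best_score:
            best_prompt, best_image, best_score = prompt, image, score
        prompt = revised_prompt  # try GPT-4V's suggestion in the next round
    return best_prompt, best_image
```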
Experiments demonstrate that DreamScene360 outperforms the state-of-the-art in terms of global consistency and visual quality, providing a seamless, immersive experience for 360-degree 3D scene generation from text prompts.
Stats
The paper does not provide specific numerical data or metrics to support its key claims; the evaluation is primarily based on qualitative comparisons and discussion.
Quotes
The paper does not contain any striking quotes that support its key claims.