DreamScape is an approach for generating 3D scenes from text prompts. Its key component is the 3D Gaussian Guide (3DG2), a comprehensive scene representation that Large Language Models (LLMs) derive from the text prompt. 3DG2 records the scene's semantic primitives (objects), their spatial transformations, and the correlations between them, which lets DreamScape follow a local-global generation strategy: objects are first optimized individually, then the scene is optimized as a whole.
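To make the three ingredients of 3DG2 concrete, here is a minimal sketch of how such a guide could be held in memory. The class names, field names, the Euler-angle convention, and the example values are all assumptions for illustration; the source does not specify the actual data layout or the format of the LLM output.

```python
from dataclasses import dataclass, field


@dataclass
class Primitive:
    """One semantic primitive (object) and its spatial transformation."""
    name: str
    position: tuple = (0.0, 0.0, 0.0)      # translation in scene coordinates
    rotation_deg: tuple = (0.0, 0.0, 0.0)  # Euler angles (assumed convention)
    scale: float = 1.0                     # uniform scale relative to the scene


@dataclass
class SceneGuide:
    """Hypothetical container for the contents of a 3DG2-style guide."""
    prompt: str
    primitives: list = field(default_factory=list)
    # Scene correlations stored as (subject, relation, object) triples.
    correlations: list = field(default_factory=list)

    def get(self, name):
        """Return the primitive with the given name."""
        for primitive in self.primitives:
            if primitive.name == name:
                return primitive
        raise KeyError(name)


# Example values are made up; in DreamScape the guide is derived by an LLM.
guide = SceneGuide(
    prompt="a teapot on a wooden table",
    primitives=[
        Primitive("table", position=(0.0, 0.0, 0.0), scale=1.0),
        Primitive("teapot", position=(0.0, 0.8, 0.0), scale=0.2),
    ],
    correlations=[("teapot", "on", "table")],
)
```

Keeping each transformation attached to its primitive is what makes the later local (per-object) and global (whole-scene) stages easy to separate in a layout like this.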
During local optimization, DreamScape applies a progressive scale control technique so that the scale of each object matches the overall scene. At the global level, a collision loss between objects prevents intersection and misalignment, which corrects potential spatial biases in 3DG2 and ensures physical correctness.
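The idea behind a collision loss can be sketched with a deliberately simplified stand-in: each object is approximated by a bounding sphere, and overlapping pairs are penalized by their squared penetration depth. This is not the formulation used in DreamScape, which the source does not give; the function name and the sphere approximation are assumptions.

```python
import math


def sphere_collision_loss(centers, radii):
    """Sum of squared penetration depths over all pairs of bounding spheres.

    Simplified stand-in for a collision loss: the value is zero when no two
    spheres overlap and grows as objects intersect more deeply.
    """
    loss = 0.0
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            distance = math.dist(centers[i], centers[j])
            penetration = max(0.0, radii[i] + radii[j] - distance)
            loss += penetration ** 2
    return loss


# Two spheres whose centers are 1.0 apart with radii 1.0 and 0.5 overlap by 0.5.
overlapping = sphere_collision_loss([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [1.0, 0.5])
# Moving the second sphere 3.0 away removes the overlap entirely.
separated = sphere_collision_loss([(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)], [1.0, 0.5])
```

In an actual optimization loop the same quantity would be written in a differentiable framework so that its gradient can push intersecting objects apart.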
To model pervasive objects such as rain and snow, DreamScape introduces sparse initialization together with corresponding densification and pruning strategies. These allow a more realistic representation of objects that are distributed widely across the scene.
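A rough sketch of what sparse initialization and pruning could look like for a pervasive object: a small number of seed points is scattered across the whole scene volume, and points whose opacity falls below a threshold are later dropped. The function names, the uniform sampling, the threshold value, and the opacity numbers are assumptions; the source does not describe the actual strategies, and densification is left out here.

```python
import random


def sparse_init(bounds_min, bounds_max, count, seed=0):
    """Scatter a small number of seed points uniformly across the scene box."""
    rng = random.Random(seed)
    return [
        tuple(rng.uniform(lo, hi) for lo, hi in zip(bounds_min, bounds_max))
        for _ in range(count)
    ]


def prune(points, opacities, min_opacity=0.05):
    """Keep only the points whose opacity reaches the (assumed) threshold."""
    return [p for p, a in zip(points, opacities) if a >= min_opacity]


# Eight seeds spread over a 2 x 2 x 2 scene volume.
seeds = sparse_init((-1.0, 0.0, -1.0), (1.0, 2.0, 1.0), count=8)
# Made-up opacities standing in for values learned during optimization.
kept = prune(seeds, [0.9, 0.01, 0.5, 0.02, 0.7, 0.3, 0.0, 0.6])
```

Seeding the whole volume rather than a compact region is the point of the sketch: rain or snow has no single location, so the initial points have to cover the scene.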
Experiments show that DreamScape generates high-fidelity 3D scenes from text prompts and outperforms existing state-of-the-art methods in semantic accuracy, visual quality, and multi-view consistency. The approach also supports flexible editing: users can modify the positions, scales, and rotations of objects.
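Editing an object's position, scale, and rotation amounts to applying a new transformation to that object alone. The sketch below does this for a list of point centers, rotating about the vertical (y) axis only. The function name and the scale-rotate-translate order are assumptions, and a full Gaussian representation would also need its per-point orientations and extents updated, which is not shown.

```python
import math


def edit_object(points, scale=1.0, yaw_deg=0.0, translation=(0.0, 0.0, 0.0)):
    """Scale, rotate about the y-axis, then translate a list of (x, y, z) points."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    tx, ty, tz = translation
    edited = []
    for x, y, z in points:
        # Uniform scaling about the object's local origin.
        x, y, z = x * scale, y * scale, z * scale
        # Right-handed rotation about the y-axis.
        xr = c * x + s * z
        zr = -s * x + c * z
        edited.append((xr + tx, y + ty, zr + tz))
    return edited


# Double the size, turn by 90 degrees, and shift one unit along x.
moved = edit_object(
    [(1.0, 0.0, 0.0)], scale=2.0, yaw_deg=90.0, translation=(1.0, 0.0, 0.0)
)
```

Because the transformation is stored per object, an edit of this kind leaves the rest of the scene untouched.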
Source: Xuening Yuan... at arxiv.org, 04-16-2024
https://arxiv.org/pdf/2404.09227.pdf