
SceneCraft: Generating 3D Indoor Scenes from Text Descriptions and User-Defined Layouts


Core Concepts
SceneCraft is a novel method for generating high-quality 3D indoor scenes by combining user-specified textual descriptions and spatial layouts, overcoming limitations of previous approaches by supporting complex multi-room scenes and free camera trajectories.
Summary
  • Bibliographic Information: Yang, X., Man, Y., Chen, J.-K., & Wang, Y.-X. (2024). SceneCraft: Layout-Guided 3D Scene Generation. In Advances in Neural Information Processing Systems (Vol. 37).

  • Research Objective: This paper introduces SceneCraft, a novel framework for generating high-quality 3D indoor scenes that adhere to both textual descriptions and user-defined spatial layouts.

  • Methodology: SceneCraft utilizes a two-stage approach. First, a 2D diffusion model, SceneCraft2D, is trained to generate high-fidelity 2D images conditioned on rendered "bounding-box images" (BBI) derived from user-specified 3D bounding box layouts. Second, a distillation-guided process leverages SceneCraft2D's generation capabilities to optimize a 3D scene representation (e.g., NeRF), gradually refining the scene geometry and texture based on the generated multi-view images.

  • Key Findings: SceneCraft outperforms existing text-to-3D and layout-guided generation methods, achieving a higher CLIP Score and higher user-study ratings for 3D consistency and visual quality. The framework handles complex indoor layouts beyond single rooms, including multi-story houses with irregular shapes, and supports free camera trajectories, which panorama-based approaches cannot.

  • Main Conclusions: SceneCraft presents a significant advancement in 3D scene generation by enabling precise user control over both scene content and spatial arrangement. The proposed method effectively combines the strengths of 2D diffusion models and 3D scene representations, paving the way for more interactive and user-friendly 3D content creation tools.

  • Significance: This research significantly contributes to the field of computer vision, particularly in 3D scene generation and understanding. It offers a promising solution for various applications, including virtual and augmented reality, video game development, and embodied AI simulations.

  • Limitations and Future Research: While SceneCraft demonstrates impressive results, future work could explore incorporating more sophisticated object representations beyond bounding boxes and investigating the generation of dynamic scenes with moving objects and changing lighting conditions.
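
The first stage of the methodology, rendering a "bounding-box image" from a 3D bounding-box layout, can be illustrated with a minimal sketch. It assumes that a BBI stores a semantic class id and a depth value per pixel, that boxes are axis-aligned, and that the camera is a simple pinhole looking down +z; none of these details are spelled out in the summary. The `Box` class, `render_bbi` function, and class ids are hypothetical names used for illustration only, not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    label: int   # semantic class id (hypothetical numbering; 0 is background)
    lo: tuple    # (x, y, z) minimum corner
    hi: tuple    # (x, y, z) maximum corner


def ray_box_depth(origin, direction, box):
    """Slab test: distance along the ray to an axis-aligned box, or None on a miss."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box.lo, box.hi):
        if abs(d) < 1e-12:
            # Ray is parallel to this slab: it must already lie inside it.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near


def render_bbi(boxes, cam_pos, width=8, height=6, focal=8.0):
    """Render per-pixel semantic-id and depth maps for a pinhole camera facing +z."""
    semantic = [[0] * width for _ in range(height)]
    depth = [[math.inf] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            # Unit-length ray through the centre of pixel (u, v).
            dx = (u + 0.5 - width / 2) / focal
            dy = (v + 0.5 - height / 2) / focal
            norm = math.sqrt(dx * dx + dy * dy + 1.0)
            direction = (dx / norm, dy / norm, 1.0 / norm)
            # Keep the nearest box hit along the ray.
            for box in boxes:
                t = ray_box_depth(cam_pos, direction, box)
                if t is not None and t < depth[v][u]:
                    depth[v][u] = t
                    semantic[v][u] = box.label
    return semantic, depth


# Toy layout: a large box (class 1) with a smaller box (class 2) in front of it.
layout = [
    Box(1, (-1.0, -1.0, 4.0), (1.0, 1.0, 6.0)),
    Box(2, (-0.25, -0.25, 2.0), (0.25, 0.25, 3.0)),
]
semantic_map, depth_map = render_bbi(layout, cam_pos=(0.0, 0.0, 0.0))
for row in semantic_map:
    print("".join(str(c) for c in row))
```

Repeating this rendering along a camera trajectory would give the multi-view conditioning maps that the 2D diffusion model consumes; the second, distillation stage is not sketched here.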


Statistics
SceneCraft achieves a CLIP Score of 24.34, outperforming Text2Room (22.98) and MVDiffusion (23.85). In user studies, SceneCraft scored 3.71 for 3D consistency and 3.56 for visual quality, both higher than the baseline methods.
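
For context on how such a number might be computed: the summary does not say which CLIP Score variant is used, but a common convention is 100 times the cosine similarity between an image embedding and a text embedding, clamped at zero. The sketch below assumes that convention. The `clip_score` helper is a hypothetical name, and the vectors are toy values, not real CLIP outputs.

```python
import math


def clip_score(image_emb, text_emb, scale=100.0):
    """Scaled, zero-clamped cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(image_emb, text_emb))
    norm = math.sqrt(sum(a * a for a in image_emb)) * math.sqrt(sum(b * b for b in text_emb))
    return scale * max(dot / norm, 0.0)


# Toy embeddings (assumed values, for illustration only).
print(clip_score([3.0, 4.0], [4.0, 3.0]))

# Margins implied by the reported scores.
reported = {"SceneCraft": 24.34, "Text2Room": 22.98, "MVDiffusion": 23.85}
for name in ("Text2Room", "MVDiffusion"):
    print(name, round(reported["SceneCraft"] - reported[name], 2))
```

Under the reported figures, SceneCraft leads Text2Room by 1.36 points and MVDiffusion by 0.49 points.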
Quotes
  • "We introduce SceneCraft, a novel method for generating detailed indoor scenes that adhere to textual descriptions and spatial layout preferences provided by users."

  • "Central to our method is a rendering-based technique, which converts 3D semantic layouts into multi-view 2D proxy maps."

  • "Without the constraints of panorama image generation, we surpass previous methods in supporting complicated indoor space generation beyond a single room, even as complicated as a whole multi-bedroom apartment with irregular shapes and layouts."

Key insights distilled from:

by Xiuyu Yang, ... at arxiv.org 10-14-2024

https://arxiv.org/pdf/2410.09049.pdf
SceneCraft: Layout-Guided 3D Scene Generation
