Key Concepts
WonderJourney, a modular framework, generates a long sequence of diverse yet coherently connected 3D scenes starting from any user-provided location, enabling users to journey through their own "wonderjourneys".
Abstract
The paper introduces WonderJourney, a novel framework for perpetual 3D scene generation. Unlike prior work that focuses on generating a single type of scene, WonderJourney aims to synthesize a series of diverse 3D scenes starting from an arbitrary location specified via a single image or language description. The generated 3D scenes should be coherently connected along a long-range camera trajectory, allowing users to experience a visual journey through various plausible places.
The key challenges addressed by WonderJourney include generating diverse yet plausible scene elements, determining their spatial layout, and ensuring geometric consistency across the connected scenes. The framework decomposes this task into three main modules:
Scene Description Generation: A large language model (LLM) autoregressively generates a series of textual descriptions for the next scenes in the journey.
Visual Scene Generation: A text-driven visual generation module takes each scene description and the current scene image and produces the next 3D scene as a colored point cloud. This step combines text-guided outpainting, depth estimation, and depth refinement; the refinement addresses issues such as depth discontinuities and inaccurate sky depth.
Visual Validation: A vision-language model (VLM) detects and rejects undesired visual artifacts, such as painting frames or out-of-focus objects, and a rejection triggers re-generation of the scene.
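The way the three modules interact can be sketched as a simple loop: describe the next scene, generate it, validate it, and retry on rejection. This is a minimal illustration, not the authors' implementation. The names `Scene`, `generate_journey`, `describe_next`, `generate_scene`, and `has_artifacts` are hypothetical, as are the retry limit and the choice to keep the last candidate when every attempt is rejected. The toy stand-in functions replace the LLM, the visual generator, and the VLM so the sketch runs without any models.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Scene:
    description: str  # text from the scene description module
    point_cloud: Any  # stand-in for a colored point cloud


def generate_journey(
    start: Scene,
    num_scenes: int,
    describe_next: Callable[[List[str]], str],       # LLM stand-in
    generate_scene: Callable[[str, Scene], Scene],   # visual generator stand-in
    has_artifacts: Callable[[Scene], bool],          # VLM validator stand-in
    max_retries: int = 3,                            # assumed limit, not from the paper
) -> List[Scene]:
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    journey = [start]
    for _ in range(num_scenes):
        # Autoregressive step: condition on all earlier descriptions.
        history = [scene.description for scene in journey]
        description = describe_next(history)
        # Generate from the current scene; regenerate if validation rejects.
        for _attempt in range(max_retries):
            candidate = generate_scene(description, journey[-1])
            if not has_artifacts(candidate):
                break
        # Assumption: if every attempt is rejected, keep the last candidate.
        journey.append(candidate)
    return journey


# Toy stand-ins so the sketch runs end to end.
calls = {"generate": 0}


def toy_describe(history: List[str]) -> str:
    return f"scene {len(history)}"


def toy_generate(description: str, previous: Scene) -> Scene:
    calls["generate"] += 1
    return Scene(description, point_cloud=calls["generate"])


def toy_has_artifacts(scene: Scene) -> bool:
    # Reject every odd-numbered attempt to exercise the retry path.
    return scene.point_cloud % 2 == 1


journey = generate_journey(
    Scene("scene 0", None), 3, toy_describe, toy_generate, toy_has_artifacts
)
print([scene.description for scene in journey])
```

Passing the modules in as callables mirrors the modular design described above: any of the three stand-ins could be swapped for a different model without touching the loop.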
The modular design of WonderJourney allows it to leverage the latest advancements in vision and language models. Qualitative results demonstrate that WonderJourney can generate diverse and coherent "wonderjourneys" starting from various types of inputs, including real photos and generated art. A user study also shows that WonderJourney is strongly preferred over state-of-the-art perpetual view generation baselines in terms of diversity, visual quality, scene complexity, and overall interestingness.
Statistics
The paper does not provide any specific numerical data or statistics. The focus is on qualitative results and a user study comparing WonderJourney with baseline methods.
Quotes
"No, no! The adventures first, explanations take such a dreadful time." – Alice's Adventures in Wonderland