Zhao, Y., Lin, C., Lin, K., Yan, Z., Li, L., Yang, Z., Wang, J., Lee, G.H., & Wang, L. (2025). GenXD: Generating Any 3D and 4D Scenes. In Proceedings of the International Conference on Learning Representations (ICLR 2025).
This paper addresses two obstacles in 3D and 4D scene generation: the lack of large-scale, diverse 4D datasets, and the need for a single model that handles both static and dynamic scenes from a varying number of input views.
The authors introduce a data curation pipeline to create CamVid-30K, a large-scale 4D dataset with camera pose and object motion annotations derived from existing video datasets. They propose GenXD, a unified framework based on latent diffusion models, incorporating multiview-temporal modules to disentangle camera and object motion and masked latent conditioning to support single and multi-view image inputs. GenXD is trained on a combination of 3D and 4D datasets and evaluated on various tasks, including 4D scene and object generation, and few-view 3D reconstruction.
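The masked latent conditioning mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `masked_latent_condition`, the tensor layout `(N, C, H, W)`, and the choice to append the mask as an extra channel are all assumptions made for illustration. The idea shown is that positions holding a given input view keep their clean latent, all other positions keep the noisy latent, and a binary mask tells the model which is which, so the same interface covers one conditioning view or several.

```python
import numpy as np


def masked_latent_condition(clean_latents, noisy_latents, cond_indices):
    """Build a masked-latent input for a denoiser (illustrative sketch only).

    clean_latents, noisy_latents: arrays of shape (N, C, H, W), one latent
        per view/frame. The layout is an assumption, not taken from the paper.
    cond_indices: indices of the views supplied as conditioning images.
    Returns an array of shape (N, C + 1, H, W): mixed latents plus a mask channel.
    """
    n, _, h, w = clean_latents.shape

    # Binary mask: 1 where a conditioning view is given, 0 elsewhere.
    mask = np.zeros((n, 1, h, w), dtype=clean_latents.dtype)
    mask[list(cond_indices)] = 1.0

    # Conditioned positions keep the clean latent; the rest stay noisy.
    mixed = mask * clean_latents + (1.0 - mask) * noisy_latents

    # Append the mask as an extra channel so the model can tell them apart.
    return np.concatenate([mixed, mask], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(4, 2, 3, 3))
    noisy = rng.normal(size=(4, 2, 3, 3))

    single_view = masked_latent_condition(clean, noisy, [0])
    multi_view = masked_latent_condition(clean, noisy, [0, 3])
    print(single_view.shape, multi_view.shape)
```

Passing `[0]` corresponds to single-view input and `[0, 3]` to multi-view input; only the mask changes, not the model interface.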
The paper contributes a unified generation framework, GenXD, and a large-scale 4D dataset, CamVid-30K. Because GenXD generates consistent scenes from either a single input view or several, it is relevant to applications in gaming, visual effects, and virtual reality.
By generating 3D and 4D scenes directly from images, the work addresses a gap in 3D and 4D content creation. The introduction of CamVid-30K also gives later research on 4D generation a large-scale dataset to build on.
The paper acknowledges the computational demands of training and deploying such models. Future research could explore more efficient architectures and training strategies. Additionally, investigating the generation of higher-resolution scenes and incorporating semantic understanding for finer control over scene elements are promising directions.
Key insights distilled from: Yuyang Zhao,... at arxiv.org, 11-05-2024
https://arxiv.org/pdf/2411.02319.pdf