The paper proposes LiDAR Diffusion Models (LiDMs), a novel generative framework for efficient and realistic LiDAR scene generation. The key contributions are:
LiDMs leverage range images as the data representation, which enables reversible, lossless conversion between range images and point clouds and benefits from highly optimized 2D convolutional operations (see the projection sketch after this list).
To achieve LiDAR-realistic generation, LiDMs incorporate three core designs: curve-wise compression to maintain the curve-like patterns of LiDAR scans, point-wise coordinate supervision to regularize the scene-level geometry, and patch-wise encoding to capture the full context of 3D objects (illustrated in the sketch after this list).
LiDMs support diverse conditioning inputs, including semantic maps, camera views, and text prompts, enabling applications such as Semantic-Map-to-LiDAR, Camera-to-LiDAR, and zero-shot Text-to-LiDAR generation (a conditioning sketch follows below).
Extensive experiments demonstrate that LiDMs outperform previous state-of-the-art methods on both unconditional and conditional LiDAR scene generation while achieving a speedup of up to 107x.
The paper introduces three novel perceptual metrics (FRID, FSVD, FPVD) to comprehensively evaluate the quality of generated LiDAR scenes (the shared Fréchet computation is sketched below).
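As a concrete illustration of the range-image representation, here is a minimal sketch of the standard spherical projection between point clouds and range images. It assumes uniformly spaced beam angles and one return per pixel, so the round trip is lossless only up to pixel quantization; the paper's setup indexes pixels by the sensor's actual scan pattern. The parameter values (64 beams, 1024 columns, a 3° to -25° vertical field of view) are common HDL-64E-style assumptions, not taken from the paper.

```python
import numpy as np

def points_to_range_image(points, H=64, W=1024,
                          fov_up=np.radians(3.0), fov_down=np.radians(-25.0)):
    """Project an (N, 3) point cloud to an (H, W) range image.
    Assumes uniformly spaced beams between fov_down and fov_up."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                              # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0))
    u = ((1.0 - (yaw / np.pi + 1.0) / 2.0) * W).astype(np.int64) % W
    v = np.clip(((fov_up - pitch) / (fov_up - fov_down) * H).astype(np.int64),
                0, H - 1)
    img = np.zeros((H, W), dtype=np.float32)
    img[v, u] = r                                       # last return wins per pixel
    return img

def range_image_to_points(img, fov_up=np.radians(3.0), fov_down=np.radians(-25.0)):
    """Back-project every valid pixel of the range image to (x, y, z)."""
    H, W = img.shape
    v, u = np.nonzero(img > 0)
    r = img[v, u]
    yaw = (1.0 - 2.0 * (u + 0.5) / W) * np.pi
    pitch = fov_up - (v + 0.5) / H * (fov_up - fov_down)
    x = r * np.cos(pitch) * np.cos(yaw)
    y = r * np.cos(pitch) * np.sin(yaw)
    z = r * np.sin(pitch)
    return np.stack([x, y, z], axis=1)
```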
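Curve-wise compression can be pictured as an encoder that downsamples only along the horizontal (curve) axis, leaving the beam rows intact, while point-wise coordinate supervision compares reconstructions after back-projecting ranges to Cartesian coordinates. The following is a minimal sketch under those assumptions; the layer widths, kernel sizes, and the `to_xyz` helper are hypothetical, not the paper's exact design.

```python
import torch
import torch.nn as nn

class CurveWiseDown(nn.Module):
    """Downsample only along the horizontal (curve) axis of the range
    image, keeping every beam row intact. Channel widths and kernel
    sizes here are illustrative assumptions."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(c_in, c_out, kernel_size=(1, 4), stride=(1, 2),
                      padding=(0, 1)),        # horizontal-only downsampling
            nn.GroupNorm(8, c_out),
            nn.SiLU(),
        )

    def forward(self, x):           # x: (B, C, H, W)
        return self.net(x)          # -> (B, c_out, H, W // 2)

def pointwise_coord_loss(pred_range, gt_range, to_xyz):
    """Point-wise coordinate supervision (sketch): compare reconstructions
    in Cartesian space. `to_xyz` is a hypothetical differentiable
    range-to-coordinates back-projection."""
    return torch.mean(torch.abs(to_xyz(pred_range) - to_xyz(gt_range)))

# A 64 x 1024 range image compressed 4x along the curve direction:
enc = nn.Sequential(CurveWiseDown(1, 32), CurveWiseDown(32, 64))
z = enc(torch.randn(2, 1, 64, 1024))   # z.shape == (2, 64, 64, 256)
```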
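For the conditional variants, one common latent-diffusion pattern is to concatenate spatial conditions (such as an encoded semantic map) with the noisy latent and to inject global conditions (such as CLIP embeddings of a camera view or text prompt) through cross-attention. The denoiser below sketches that generic pattern; every module name and dimension is an assumption, and the paper's conditioning mechanism may differ in detail.

```python
import torch
import torch.nn as nn

class ConditionedDenoiser(nn.Module):
    """Generic conditioned-denoiser sketch: spatial conditions are
    concatenated channel-wise with the noisy latent; global tokens
    (e.g. CLIP embeddings) enter through cross-attention. Hypothetical
    shapes and names, not the paper's exact model."""
    def __init__(self, z_ch=4, cond_ch=4, ctx_dim=512, width=128):
        super().__init__()
        self.in_proj = nn.Conv2d(z_ch + cond_ch, width, 3, padding=1)
        self.attn = nn.MultiheadAttention(width, num_heads=4, kdim=ctx_dim,
                                          vdim=ctx_dim, batch_first=True)
        self.out_proj = nn.Conv2d(width, z_ch, 3, padding=1)

    def forward(self, z_noisy, cond_map, ctx):
        # z_noisy: (B, z_ch, H, W)    noisy latent of the range image
        # cond_map: (B, cond_ch, H, W) encoded spatial condition
        # ctx: (B, T, ctx_dim)         global condition tokens
        h = self.in_proj(torch.cat([z_noisy, cond_map], dim=1))
        B, C, H, W = h.shape
        seq = h.flatten(2).transpose(1, 2)                 # (B, H*W, C)
        seq = seq + self.attn(seq, ctx, ctx, need_weights=False)[0]
        return self.out_proj(seq.transpose(1, 2).reshape(B, C, H, W))

# Example: a 16 x 256 latent conditioned on a semantic map and 8 CLIP tokens.
net = ConditionedDenoiser()
eps = net(torch.randn(2, 4, 16, 256), torch.randn(2, 4, 16, 256),
          torch.randn(2, 8, 512))       # -> (2, 4, 16, 256)
```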
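The three metrics follow the FID recipe: extract features of real and generated scenes with a pretrained perception backbone (a range-image, sparse-voxel, or point-voxel network, matching the metric names), then compare the two feature distributions with the Fréchet distance. The sketch below shows only that shared Fréchet computation, assuming features have already been extracted.

```python
import numpy as np
from scipy import linalg

def frechet_distance(feats_real, feats_gen):
    """Fréchet distance between Gaussian fits of two (N, D) feature sets,
    the computation shared by FRID/FSVD/FPVD (and image FID)."""
    mu1, mu2 = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    s1 = np.cov(feats_real, rowvar=False)
    s2 = np.cov(feats_gen, rowvar=False)
    covmean, _ = linalg.sqrtm(s1 @ s2, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real          # drop numerical imaginary residue
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(s1 + s2 - 2.0 * covmean))
```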
Source: Haoxi Ran et al., arxiv.org, 04-02-2024, https://arxiv.org/pdf/2404.00815.pdf