The paper proposes LiDAR Diffusion Models (LiDMs), a novel generative framework for efficient and realistic LiDAR scene generation. The key contributions are:
- LiDMs adopt range images as the data representation, which enables reversible and lossless conversion between range images and point clouds and benefits from highly optimized 2D convolutional operations (see the projection sketch after this list).
- To achieve LiDAR-realistic generation, LiDMs incorporate three core designs: curve-wise compression to preserve the curve-like patterns of scan lines, point-wise coordinate supervision to regularize scene-level geometry, and patch-wise encoding to capture the full context of 3D objects (a compression sketch follows this list).
- LiDMs support diverse conditioning inputs, including semantic maps, camera views, and text prompts, enabling applications such as Semantic-Map-to-LiDAR, Camera-to-LiDAR, and zero-shot Text-to-LiDAR generation.
- Extensive experiments show that LiDMs outperform previous state-of-the-art methods on both unconditional and conditional LiDAR scene generation, while achieving a speedup of up to 107x.
- The paper introduces three novel perceptual metrics, Fréchet Range Image Distance (FRID), Fréchet Sparse Volume Distance (FSVD), and Fréchet Point-based Volume Distance (FPVD), to comprehensively evaluate the quality of generated LiDAR scenes (the shared Fréchet computation is sketched below).
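The following is a minimal sketch of the range-image representation mentioned above: a spherical projection from a point cloud into an (H, W) range image and its inverse. The sensor parameters here (64 beams, 1024 azimuth bins, the vertical field of view) are illustrative assumptions, not the paper's exact configuration; exact round-trip recovery holds only when the projection grid matches the sensor's native beam layout, since pixel discretization otherwise loses a small amount of precision.

```python
# Minimal sketch of range-image <-> point-cloud conversion via spherical
# projection. Sensor parameters below are illustrative assumptions.
import numpy as np

H, W = 64, 1024                      # assumed beam count and azimuth resolution
FOV_UP, FOV_DOWN = 3.0, -25.0        # assumed vertical field of view (degrees)

def points_to_range(points: np.ndarray) -> np.ndarray:
    """Project an (N, 3) point cloud into an (H, W) range image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                        # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.clip(r, 1e-6, None)) # elevation angle
    fov = np.radians(FOV_UP - FOV_DOWN)
    u = ((1.0 - (pitch - np.radians(FOV_DOWN)) / fov) * H).astype(int)
    v = ((0.5 * (1.0 - yaw / np.pi)) * W).astype(int)
    u, v = np.clip(u, 0, H - 1), np.clip(v, 0, W - 1)
    image = np.zeros((H, W), dtype=np.float32)
    image[u, v] = r                               # keep one return per pixel
    return image

def range_to_points(image: np.ndarray) -> np.ndarray:
    """Invert the projection: map each valid pixel back to an (x, y, z) point."""
    u, v = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    r = image
    fov = np.radians(FOV_UP - FOV_DOWN)
    pitch = (1.0 - u / H) * fov + np.radians(FOV_DOWN)
    yaw = (1.0 - 2.0 * v / W) * np.pi
    x = r * np.cos(pitch) * np.cos(yaw)
    y = r * np.cos(pitch) * np.sin(yaw)
    z = r * np.sin(pitch)
    mask = r > 0                                  # skip empty pixels
    return np.stack([x[mask], y[mask], z[mask]], axis=-1)
```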
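Curve-wise compression can be pictured as an encoder that downsamples the range image only along the horizontal (azimuth) axis, so scan lines are compressed along their curve-like extent while the vertical beam resolution is preserved. The PyTorch sketch below illustrates this idea under assumed channel widths and depth; it is not the paper's exact autoencoder architecture.

```python
# Sketch of curve-wise compression: stride-(1, 2) convolutions halve the
# width (along the scan curve) at each stage while keeping the beam axis.
# Channel widths and depth are illustrative assumptions.
import torch
import torch.nn as nn

class CurveWiseEncoder(nn.Module):
    def __init__(self, in_ch: int = 1, base_ch: int = 32, num_down: int = 3):
        super().__init__()
        layers, ch = [nn.Conv2d(in_ch, base_ch, 3, padding=1)], base_ch
        for _ in range(num_down):
            # stride (1, 2): halve width (azimuth), keep height (beams)
            layers += [nn.Conv2d(ch, ch * 2, kernel_size=3,
                                 stride=(1, 2), padding=1),
                       nn.GroupNorm(8, ch * 2), nn.SiLU()]
            ch *= 2
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

enc = CurveWiseEncoder()
z = enc(torch.randn(1, 1, 64, 1024))
print(z.shape)   # torch.Size([1, 256, 64, 128]): width 1024 -> 128, height kept
```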
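FRID, FSVD, and FPVD all follow the standard Fréchet-distance recipe: extract features from real and generated scenes with a pretrained backbone (range-image, sparse-voxel, or point-voxel based, respectively), fit a Gaussian to each feature set, and measure the distance between the two Gaussians. The sketch below shows that shared computation; the feature extractor is left out, as the paper's specific backbones are not reproduced here.

```python
# Shared Frechet-distance computation behind FRID/FSVD/FPVD:
# d^2 = ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 (C_r C_f)^(1/2))
import numpy as np
from scipy import linalg

def frechet_distance(real_feats: np.ndarray, fake_feats: np.ndarray) -> float:
    """Frechet distance between Gaussians fit to (N, D) feature matrices."""
    mu_r, mu_f = real_feats.mean(0), fake_feats.mean(0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_f = np.cov(fake_feats, rowvar=False)
    covmean = linalg.sqrtm(cov_r @ cov_f)   # matrix square root
    if np.iscomplexobj(covmean):
        covmean = covmean.real              # discard tiny imaginary residue
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))
```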
Key insights distilled from: Haoxi Ran, Vi... et al., arxiv.org, 04-02-2024, https://arxiv.org/pdf/2404.00815.pdf