Khái niệm cốt lõi
A novel approach for jointly estimating missing geometry and semantics from sparse LiDAR point clouds using denoising diffusion probabilistic models.
Tóm tắt
The paper proposes a novel approach called DiffSSC for semantic scene completion (SSC) using denoising diffusion probabilistic models (DDPMs). SSC aims to jointly predict unobserved geometry and semantics in a scene given raw LiDAR measurements, providing a more complete scene representation.
The key contributions are:
- Utilizing DDPMs for the SSC task, introducing a residual-learning mechanism compared to traditional approaches that directly estimate the complete scene from partial input.
- Separately modeling the point and semantic spaces to adapt to the diffusion process.
- Operating directly on the point cloud, avoiding quantization errors and reducing memory usage, while making it a more efficient method for LiDAR point clouds.
- Designing local and global regularization losses to stabilize the learning process.
The proposed DiffSSC model first semantically segments the raw LiDAR point cloud using Cylinder3D. The semantic point cloud is then upsampled and undergoes a forward diffusion and reverse denoising process, refining the positions and semantics. The original semantic point cloud serves as a conditional input to guide the generation of points in gaps and occluded areas. Finally, a refinement model based on MinkUNet is used to increase the density of the generated scene.
Experiments on the SemanticKITTI and SSCBench-KITTI360 datasets show that DiffSSC outperforms the state-of-the-art methods for semantic scene completion.
Thống kê
The model is trained and validated on the SemanticKITTI dataset, using sequences 00-06 for training and sequences 09-10 for validation.
The model is evaluated on the official validation sets of both the SemanticKITTI (sequence 08) and SSCBench-KITTI360 (sequence 07) datasets.
Trích dẫn
"Perception systems collect low-level attributes of the surrounding environment, such as depth, temperature, and color, through various sensor technologies. These systems leverage machine learning algorithms to achieve high-level understanding, such as object detection and semantic segmentation."
"To provide dense and semantic scene representations for downstream decision-making and action systems, Semantic Scene Completion (SSC) has been proposed, aimed at jointly predicting missing points and semantics from raw LiDAR point clouds."