SceneSense: 3D Occupancy Synthesis from Partial Observations
Key Concepts
SceneSense is a real-time 3D diffusion model that predicts occluded geometry to support downstream planning and control in robotics.
Summary
SceneSense introduces a real-time 3D diffusion model for synthesizing 3D occupancy information from partial observations.
The framework uses a running occupancy map and a single RGB-D camera to generate predicted geometry around the platform at runtime.
By preserving observed space integrity, SceneSense mitigates the risk of corrupting observed areas with generative predictions.
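The observed-space guarantee can be pictured as a masked merge: generative output is only written into voxels the running map has never measured. A minimal sketch, assuming a boolean observation mask alongside the occupancy grid (the function and variable names here are illustrative, not the paper's API):

```python
import numpy as np

def merge_predictions(observed_occ, observed_mask, predicted_occ):
    """Overlay generative predictions only onto unobserved voxels.

    observed_occ:  occupancy from the running map (0/1 per voxel)
    observed_mask: True where a voxel has been directly measured
    predicted_occ: generative-model output for the same region
    """
    # Observed voxels keep their measured values; only gaps are filled.
    return np.where(observed_mask, observed_occ, predicted_occ)

# Example: first two voxels observed, last two filled by the prediction.
obs = np.array([1.0, 0.0, 0.0, 0.0])
mask = np.array([True, True, False, False])
pred = np.array([0.0, 1.0, 0.9, 0.1])
merged = merge_predictions(obs, mask, pred)
print(merged)  # observed entries unchanged, gaps take predicted values
```

Because the merge never overwrites a measured voxel, a poor generative sample can only degrade the *unknown* region, which is what the paper means by preserving observed-space integrity.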
The method enhances local occupancy predictions around the platform, matching the ground-truth occupancy distribution more closely than the running occupancy map alone.
The architecture allows for extension to additional modalities beyond a single RGB-D camera.
I. Introduction:
Humans rely on 'common sense' inferences, while robots are limited to decisions based on directly measured sensor data, such as camera images or lidar returns.
SceneSense aims to address this gap by predicting out-of-view or occluded geometry using generative AI methods.
II. Related Works:
Semantic Scene Completion (SSC) focuses on generating dense semantically labeled scenes from sparse representations in target areas.
Generative 3D Scene Synthesis involves constructing 3D scenes from multiple camera views using Neural Radiance Fields (NeRFs).
III. Preliminaries and Problem Definition:
Dense Occupancy Prediction aims to assign every voxel in a target region an occupancy value between 0 and 1.
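In practice a dense prediction is binarized before it is used by a planner. A minimal sketch of that step, assuming a simple fixed threshold (the threshold value is illustrative; the paper does not prescribe one here):

```python
import numpy as np

def threshold_occupancy(probs, tau=0.5):
    """Binarize per-voxel occupancy probabilities in [0, 1].

    probs: array of predicted occupancy values
    tau:   cutoff above which a voxel is treated as occupied
    """
    return (probs >= tau).astype(np.uint8)

# A 2x2 slice of predicted occupancy probabilities.
grid = np.array([[0.9, 0.2],
                 [0.51, 0.49]])
binary = threshold_occupancy(grid)
print(binary)  # 1 where probability >= 0.5, else 0
```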
IV. Method:
The architecture includes denoising networks, feature extraction, conditioning, and occupancy mapping for accurate predictions.
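The denoising core follows the standard diffusion recipe: start the target region from pure noise and repeatedly apply the learned denoiser along a variance schedule. A minimal DDPM-style sampling sketch, assuming a generic `denoise_fn(x, t)` that predicts the added noise; the names and schedule are illustrative, not SceneSense's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def ddpm_sample(denoise_fn, shape, betas):
    """Ancestral DDPM sampling over a voxel grid of the given shape.

    denoise_fn(x, t): network predicting the noise present at step t
    betas:            forward-process variance schedule
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    x = rng.standard_normal(shape)            # start from pure noise
    for t in range(len(betas) - 1, -1, -1):
        eps = denoise_fn(x, t)                # predicted noise at step t
        coef = betas[t] / np.sqrt(1.0 - alpha_bar[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])
        if t > 0:                             # no noise on the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

# Toy run with a dummy denoiser that predicts zero noise everywhere.
betas = np.linspace(1e-4, 0.02, 10)
sample = ddpm_sample(lambda x, t: np.zeros_like(x), (4, 4, 4), betas)
print(sample.shape)
```

Conditioning on the RGB-D features and the running map would enter through `denoise_fn`; this sketch only shows the sampling loop itself.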
V. Experiments:
Habitat lab simulation platform and HM3D dataset are used for training and testing data generation.
VI. Results:
Quantitative comparisons show that SceneSense outperforms baseline methods in terms of FID and KID metrics.
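KID, one of the two reported metrics, is an unbiased MMD estimate with a polynomial kernel over feature embeddings of real and generated samples. A small numpy sketch, assuming precomputed `(n, d)` feature matrices from some encoder (this is an illustrative implementation, not the paper's evaluation code):

```python
import numpy as np

def kid(feats_x, feats_y):
    """Kernel Inception Distance: unbiased MMD^2 with the
    polynomial kernel k(a, b) = (a.b / d + 1)^3."""
    d = feats_x.shape[1]
    k = lambda a, b: (a @ b.T / d + 1.0) ** 3
    kxx, kyy, kxy = k(feats_x, feats_x), k(feats_y, feats_y), k(feats_x, feats_y)
    n, m = len(feats_x), len(feats_y)
    # Unbiased within-set terms drop the kernel diagonal.
    term_xx = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_yy = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_xx + term_yy - 2.0 * kxy.mean()

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 16))        # "real" features
b = a + 0.01 * rng.standard_normal((64, 16))  # near-identical set
c = a + 2.0                              # clearly shifted set
print(kid(a, b) < kid(a, c))             # closer distributions score lower
```

Lower KID (and FID) means the generated occupancy distribution is closer to the ground-truth distribution, which is the sense in which SceneSense outperforms the baselines.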
VII. Conclusions and Future Work:
SceneSense presents a promising approach for generative local occupancy prediction in robotics applications.