Core Concepts
By imitating the imaging principle of real-world depth sensors, RaSim effectively bridges the sim-to-real domain gap concerning depth maps and produces high-fidelity RGB-D data. A range-aware rendering strategy is further introduced to enrich data diversity.
Summary
The authors introduce RaSim, a range-aware RGB-D data simulation pipeline that produces high-fidelity RGB-D data.
Key highlights:
- RaSim imitates the imaging principle of real-world depth sensors, such as the Intel RealSense D400 series, to generate high-quality simulated depth maps.
- A range-aware rendering strategy is incorporated to enrich data diversity. For nearby scenes, stereo IR images are used for depth simulation, while for distant scenes, binocular RGB images are used.
- The RaSim pipeline is implemented using Kubric, a dataset generator that interfaces with Blender and PyBullet. The authors use it to create a large-scale synthetic RGB-D dataset with over 206K images across 9,835 diverse scenes, featuring physical simulations, comprehensive annotations, and domain randomization.
- Experiments on depth completion and depth pre-training tasks demonstrate that models trained with the RaSim dataset can be directly applied to real-world datasets like ClearGrasp and YCB-V without the need for finetuning, effectively bridging the sim-to-real domain gap.
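The range-aware strategy described in the highlights can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the 1.0 m near/far cutoff are hypothetical (the summary does not state how RaSim decides that a scene is nearby), and returning 0 for an unmatched pixel is an assumed convention. The only part taken as given is the standard rectified-stereo relation depth = focal length × baseline / disparity.

```python
# Hypothetical cutoff (metres) between "nearby" and "distant" scenes.
# The source does not give the value RaSim uses.
NEAR_RANGE_LIMIT_M = 1.0


def select_stereo_modality(camera_to_object_m: float,
                           near_limit_m: float = NEAR_RANGE_LIMIT_M) -> str:
    """Pick which stereo image pair is fed to the stereo matcher."""
    if camera_to_object_m <= 0:
        raise ValueError("distance must be positive")
    # Nearby scenes use the stereo IR pair; distant scenes use the
    # binocular RGB pair.
    return "ir" if camera_to_object_m <= near_limit_m else "rgb"


def disparity_to_depth(disparity_px: float,
                       focal_px: float,
                       baseline_m: float) -> float:
    """Rectified-stereo triangulation: depth = focal * baseline / disparity."""
    if disparity_px <= 0:
        # No valid match: report a hole as 0 (assumed convention).
        return 0.0
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    for distance_m in (0.5, 3.0):
        print(distance_m, select_stereo_modality(distance_m))
    # Illustrative camera parameters, not those of any specific sensor.
    print(disparity_to_depth(disparity_px=32.0, focal_px=640.0, baseline_m=0.05))
```

Keeping modality selection separate from triangulation mirrors the description above: the same matching and depth conversion can run on either image pair, and only the input changes with range.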
Statistics
The RaSim dataset contains over 206K synthetic RGB-D images across 9,835 diverse scenes.
Quotes
"By imitating the imaging principle of the stereo camera based on the RealSense D400 series, we first generate large corpora of virtual scenes with photo-realistic object models, diverse backgrounds, global illuminations, and physical simulations."
"To alleviate this problem, we propose a range-aware rendering strategy. Recapping Fig. 1, for nearby scenes where the camera and objects are close, we perform stereo matching with IR images. While for distant scenes, the matching is based on binocular RGB images with richer texture information and brighter light illumination."