Key Concepts
NeRF-UR is a frame-to-model optimization framework for unsupervised RGB-D registration. It uses neural radiance fields (NeRF) to make pose optimization more robust to multi-view inconsistency.
Summary
This paper proposes NeRF-UR, an unsupervised RGB-D registration framework that leverages neural radiance fields (NeRF) to overcome the limitations of existing frame-to-frame optimization methods.
The key insights are:
- Instead of enforcing photometric and geometric consistency between two registered frames, NeRF-UR uses a NeRF as a global model of the scene and optimizes the poses by enforcing consistency between the input frames and the frames re-rendered from the NeRF. This design handles multi-view inconsistency factors such as lighting changes, geometric occlusion, and reflective materials better than frame-to-frame optimization does.
- To bootstrap the NeRF optimization, the authors create a synthetic dataset, Sim-RGBD, through photo-realistic simulation. They first train the registration model on Sim-RGBD with ground-truth poses, and then fine-tune it on real-world data in an unsupervised manner. This transfers the feature extraction and registration capability learned in simulation to real data.
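The frame-to-model idea in the first insight can be illustrated with a small consistency loss between an input RGB-D frame and a frame re-rendered from the scene model. This is only a sketch under stated assumptions: the summary does not give the exact loss, so the L1 residuals, the valid-depth mask, the weight `w_geo`, and the function name `frame_to_model_loss` are hypothetical choices, and the rendering step itself is not shown.

```python
import numpy as np

def frame_to_model_loss(rgb_in, depth_in, rgb_render, depth_render, w_geo=1.0):
    """Photometric + geometric residual between an input frame and a
    re-rendered frame (hypothetical L1 form, not the paper's exact loss).

    rgb_*:   (H, W, 3) float arrays
    depth_*: (H, W) float arrays, 0 marks missing depth (assumption)
    """
    # Only compare pixels where both depth maps are defined.
    valid = (depth_in > 0) & (depth_render > 0)
    if not valid.any():
        return 0.0
    # Photometric term: mean absolute color difference per pixel.
    photo = np.abs(rgb_in - rgb_render).mean(axis=-1)[valid].mean()
    # Geometric term: mean absolute depth difference.
    geo = np.abs(depth_in - depth_render)[valid].mean()
    return float(photo + w_geo * geo)

# Toy usage: a frame compared with a slightly different rendering.
rgb = np.full((2, 2, 3), 0.5)
depth = np.ones((2, 2))
loss = frame_to_model_loss(rgb, depth, rgb + 0.1, depth + 0.2)
```

In an actual pipeline a loss of this kind would be minimized with respect to the camera poses used for rendering; here it only shows what "consistency between input and re-rendered frames" measures.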
Extensive experiments on ScanNet and 3DMatch datasets demonstrate that NeRF-UR outperforms state-of-the-art supervised and unsupervised RGB-D registration methods, especially in challenging scenarios with low overlap or severe lighting changes.
Statistics
On the ScanNet dataset, the registration model achieves 97.2% rotation accuracy (5° threshold), 84.2% translation accuracy (5 cm threshold), and 93.2% Chamfer distance accuracy (1 mm threshold).
On the 3DMatch dataset, NeRF-UR improves on the current state-of-the-art method PointMBF by 2.6 percentage points in rotation accuracy, 3.2 percentage points in translation accuracy, and 1.9 percentage points in Chamfer distance accuracy.
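Threshold-based accuracies such as those above are commonly computed from per-pair pose errors. The sketch below shows one conventional way to do this, assuming rotation matrices, translations in meters, and a strict "error below threshold" test; the function names are hypothetical and the Chamfer distance metric is not covered.

```python
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    cos = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    # Clip guards against values slightly outside [-1, 1] from rounding.
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def translation_error_cm(t_est, t_gt):
    """Euclidean distance in centimeters (inputs assumed to be in meters)."""
    return float(np.linalg.norm(np.asarray(t_est) - np.asarray(t_gt)) * 100.0)

def accuracy(errors, threshold):
    """Percentage of errors strictly below the threshold (assumed convention)."""
    errors = np.asarray(errors, dtype=float)
    return float((errors < threshold).mean() * 100.0)

# Toy usage: a 10 degree rotation about the z axis versus the identity.
a = np.radians(10.0)
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0,        0.0,       1.0]])
rot_err = rotation_error_deg(R, np.eye(3))
trans_err = translation_error_cm([0.0, 0.0, 0.03], [0.0, 0.0, 0.0])
rot_acc = accuracy([1.0, 4.0, 6.0, 10.0], threshold=5.0)
```

With this convention, "rotation accuracy (5°)" is the share of frame pairs whose rotation error falls below 5 degrees, and likewise for translation at 5 cm.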
Quotes
"To overcome the reliance on annotated data in learning-based methods, the exploration of better strategies to extract information from unlabeled data for achieving unsupervised learning in RGB-D registration has gradually become a research focus."
"Enforcing the photometric and geometric consistency between the NeRF rerendering and the input frames can better optimize the estimated poses than the frame-to-frame methods, which enhances the learning signal for the registration model."