The paper presents SPVLoc, a global indoor localization method that accurately determines the six-degree-of-freedom (6D) camera pose of a query image without requiring scene-specific prior knowledge or scene-specific training.
The key highlights are:
SPVLoc employs a novel matching procedure to localize the perspective camera's viewport within a set of panoramic semantic layout representations of the indoor environment. The panoramas are rendered from an untextured 3D reference model containing approximate structural information and semantic annotations.
A convolutional network performs the image-to-panorama matching and, by extension, image-to-model matching. For each panorama, the network predicts the 2D bounding box of the viewport and classifies the match, allowing the best-matching panorama to be selected.
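The selection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the tuple layout, and the scores are all assumptions; in SPVLoc the bounding boxes and match confidences come from the convolutional network.

```python
# Hypothetical sketch of the panorama-selection step: for each reference
# panorama, the network yields a viewport bounding box and a matching
# score; the panorama with the highest score is kept. All names and
# values here are illustrative, not from the paper.

def select_best_panorama(predictions):
    """predictions: list of (panorama_id, bbox, score) tuples, where
    bbox = (u, v, width, height) in panorama pixel coordinates and
    score is the network's matching confidence."""
    return max(predictions, key=lambda p: p[2])

# Toy example with made-up network outputs:
preds = [
    ("pano_00", (120, 40, 200, 150), 0.31),
    ("pano_01", (512, 60, 180, 140), 0.87),  # highest confidence
    ("pano_02", (900, 55, 190, 145), 0.12),
]
best_id, best_bbox, best_score = select_best_panorama(preds)
print(best_id)  # -> pano_01
```

The selected panorama's bounding box then serves as the starting point for the subsequent pose refinement.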
The exact 6D pose is then estimated through relative pose regression starting from the selected panorama's position. This approach bridges the domain gap between real images and synthetic panoramas, enabling generalization to previously unseen scenes.
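The final step can be illustrated as a composition of rigid transforms: the selected panorama's pose in the model is known, and the regressed relative pose maps from the panorama to the camera. This is a sketch under assumed conventions (row-major 4x4 homogeneous matrices, example poses invented for illustration); the actual regression network and its parameterization are described in the paper.

```python
# Illustrative sketch: compose the selected panorama's known world pose
# with the network-regressed relative pose to obtain the absolute 6D
# camera pose. Matrix layout and the example values are assumptions.

def matmul4(a, b):
    """Multiply two 4x4 homogeneous transforms (row-major nested lists)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Panorama pose: identity rotation, positioned at (2, 0, 3) in the model.
T_world_pano = [[1, 0, 0, 2],
                [0, 1, 0, 0],
                [0, 0, 1, 3],
                [0, 0, 0, 1]]

# Regressed relative pose: small offset from panorama center to camera.
T_pano_cam = [[1, 0, 0, 0.5],
              [0, 1, 0, 0.0],
              [0, 0, 1, -0.25],
              [0, 0, 0, 1]]

T_world_cam = matmul4(T_world_pano, T_pano_cam)
print([row[3] for row in T_world_cam[:3]])  # camera position: [2.5, 0.0, 2.75]
```

Because the panorama positions come from the untextured reference model, no real-world imagery of the scene is needed at any point, which is what allows generalization to unseen scenes.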
Experiments on public datasets demonstrate that SPVLoc outperforms state-of-the-art methods in localization accuracy while estimating more degrees of freedom of the camera pose.
The method's performance is further analyzed through ablation studies, examining the impact of factors like grid size, focal length, and camera rotation angles. The results show the flexibility and robustness of the approach.
Key insights distilled from:
by Niklas Gard, ... at arxiv.org, 04-17-2024
https://arxiv.org/pdf/2404.10527.pdf