Core Concepts
Photo-SLAM is a novel SLAM framework that maintains a map of hyper primitives, enabling efficient tracking optimization with a factor graph solver while the photorealistic map is learned by backpropagating the loss between the original and rendered images. It introduces geometry-based densification and Gaussian-Pyramid-based learning to improve online photorealistic mapping performance.
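The hyper primitives map couples classic sparse-SLAM map points with 3D-Gaussian rendering attributes. A minimal sketch of what one such primitive might hold is below; the class and field names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class HyperPrimitive:
    """Hypothetical hyper primitive: an ORB map point fused with
    3D-Gaussian attributes (field names are assumptions)."""
    position: np.ndarray        # 3D point position, used by the factor graph solver
    orb_descriptor: np.ndarray  # 256-bit ORB descriptor (32 bytes) for tracking
    rotation: np.ndarray        # unit quaternion orienting the Gaussian
    scaling: np.ndarray         # per-axis scale of the Gaussian
    density: float              # opacity used during alpha blending
    sh_coeffs: np.ndarray       # spherical harmonic color coefficients

def make_primitive(xyz, descriptor, sh_degree=3):
    """Create a primitive with default Gaussian attributes.
    RGB color up to sh_degree needs 3 * (sh_degree + 1)**2 coefficients."""
    n_sh = 3 * (sh_degree + 1) ** 2
    return HyperPrimitive(
        position=np.asarray(xyz, dtype=np.float32),
        orb_descriptor=np.asarray(descriptor, dtype=np.uint8),
        rotation=np.array([1.0, 0.0, 0.0, 0.0], dtype=np.float32),  # identity
        scaling=np.full(3, 0.01, dtype=np.float32),
        density=0.1,
        sh_coeffs=np.zeros(n_sh, dtype=np.float32),
    )

p = make_primitive([0.5, -0.2, 1.3], np.zeros(32))
print(p.sh_coeffs.shape)  # prints (48,) for sh_degree=3
```

Tracking only reads the geometric part (position plus ORB descriptor), while mapping optimizes the photometric part, which is what lets the two run with different solvers over one shared structure.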
Abstract
The paper presents Photo-SLAM, a novel SLAM framework that addresses the scalability and computational-resource constraints of existing methods while achieving precise localization and online photorealistic mapping.
Key highlights:
Photo-SLAM maintains a hyper primitives map composed of point clouds storing ORB features, rotation, scaling, density, and spherical harmonic coefficients. This enables efficient tracking optimization with a factor graph solver and learning of the corresponding photorealistic map.
It introduces geometry-based densification to actively create additional hyper primitives based on inactive 2D feature points, and Gaussian-Pyramid-based learning to progressively acquire multi-level features, enhancing the mapping performance.
Extensive experiments demonstrate that Photo-SLAM significantly outperforms existing state-of-the-art methods for online photorealistic mapping in terms of localization efficiency, mapping quality, and rendering speed, even on embedded platforms.
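The Gaussian-Pyramid-based learning mentioned above supervises rendering coarse-to-fine: the target image is repeatedly blurred and downsampled, and training progresses from the coarsest level to the full resolution. The sketch below only builds such a pyramid with NumPy; it is an assumed illustration of the idea, not the paper's training code.

```python
import numpy as np

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian blur of a 2D array with an edge-padded 1D kernel."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    # Blur along rows, then along columns.
    pad = np.pad(img, ((radius, radius), (0, 0)), mode="edge")
    img = np.stack([np.convolve(pad[:, j], k, mode="valid")
                    for j in range(img.shape[1])], axis=1)
    pad = np.pad(img, ((0, 0), (radius, radius)), mode="edge")
    img = np.stack([np.convolve(pad[i, :], k, mode="valid")
                    for i in range(img.shape[0])], axis=0)
    return img

def build_pyramid(img, levels=3):
    """Return pyramid levels [coarsest, ..., finest]:
    each coarser level is a blurred, 2x-downsampled copy of the previous."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(gaussian_blur(pyr[-1])[::2, ::2])
    return pyr[::-1]

img = np.random.rand(64, 64)
pyr = build_pyramid(img, levels=3)
print([p.shape for p in pyr])  # prints [(16, 16), (32, 32), (64, 64)]
```

A training loop would then compare renders against `pyr[0]` first and switch to finer levels over iterations, so the map acquires low-frequency structure before high-frequency detail.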
Stats
Photo-SLAM can render hundreds of photorealistic 1200×680 views per second with less than 5 GB of GPU memory.
Quotes
"Photo-SLAM significantly outperforms current state-of-the-art SLAM systems for online photorealistic mapping, e.g., PSNR is 30% higher and rendering speed is hundreds of times faster in the Replica dataset."
"The Photo-SLAM can run at real-time speed using an embedded platform such as Jetson AGX Orin, showing the potential of robotics applications."