Core Concepts
The method is the first near real-time SLAM system to use 3D Gaussian Splatting as the sole underlying scene representation, enabling high-fidelity reconstruction from monocular input.
Abstract
The paper presents a novel SLAM system that uses 3D Gaussian Splatting (3DGS) as the sole underlying scene representation. This enables high-fidelity 3D reconstruction, even from monocular input, by leveraging the continuous and differentiable nature of the Gaussian representation.
Key highlights:
Formulates camera tracking for 3DGS using direct optimization against the 3D Gaussians, enabling fast and robust tracking.
Introduces geometric verification and regularization to handle ambiguities in incremental 3D dense reconstruction.
Develops a full SLAM system that achieves state-of-the-art results in novel view synthesis, trajectory estimation, and reconstruction of tiny and transparent objects.
Outperforms other rendering-based SLAM methods, particularly in real-world scenarios.
Can be easily extended to RGB-D SLAM when depth measurements are available.
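The RGB-D extension in the last highlight can be pictured as adding a depth term to the photometric objective. The following is only a minimal sketch under stated assumptions: images are flat lists of floats, both terms are mean L1 differences, pixels with non-positive observed depth are treated as invalid, and the function names and the weight `lambda_pho = 0.9` are illustrative choices that this summary does not specify.

```python
def mean_abs_diff(pred, obs, mask=None):
    """Mean absolute difference over the pixels where mask is True."""
    if mask is None:
        mask = [True] * len(pred)
    diffs = [abs(p - o) for p, o, m in zip(pred, obs, mask) if m]
    return sum(diffs) / len(diffs) if diffs else 0.0


def tracking_loss(rendered_gray, observed_gray,
                  rendered_depth=None, observed_depth=None,
                  lambda_pho=0.9):
    """Photometric loss, optionally blended with a depth loss.

    lambda_pho is a hypothetical weight chosen for illustration.
    """
    photometric = mean_abs_diff(rendered_gray, observed_gray)
    if rendered_depth is None or observed_depth is None:
        # Monocular case: appearance is the only supervision.
        return photometric
    # Assumption: a depth reading <= 0 marks a missing measurement.
    valid = [d > 0.0 for d in observed_depth]
    geometric = mean_abs_diff(rendered_depth, observed_depth, valid)
    return lambda_pho * photometric + (1.0 - lambda_pho) * geometric


mono = tracking_loss([0.5, 0.5], [0.4, 0.7])
rgbd = tracking_loss([0.5, 0.5], [0.4, 0.7], [2.0, 3.0], [2.5, 0.0])
```

In this sketch the monocular path is the same function with the depth arguments omitted, which mirrors the claim that the extension requires little change.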
The system maintains a 3D Gaussian map of the scene, continuously optimizing the Gaussian parameters to represent the observed geometry and appearance. Camera poses are optimized by direct alignment against the 3D Gaussian map, without the need for explicit depth estimation or other pre-trained components.
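To make direct alignment concrete, here is a toy one-dimensional sketch, not the paper's implementation. It assumes a "map" of three 1D Gaussians with made-up parameters, a camera whose only pose parameter is a shift `t`, and a Gauss-Newton update that uses the analytic derivative of the rendered intensity with respect to `t`. All names and values are hypothetical.

```python
import math

# Toy 1D map: (mean, sigma, intensity) per Gaussian. Hypothetical values.
GAUSSIANS = [(10.0, 3.0, 1.0), (22.0, 4.0, 0.6), (31.0, 2.5, 0.8)]
PIXELS = [float(x) for x in range(40)]


def render(t):
    """Render the 1D image seen by a camera shifted by t."""
    return [
        sum(c * math.exp(-((x + t - mu) ** 2) / (2.0 * s * s))
            for mu, s, c in GAUSSIANS)
        for x in PIXELS
    ]


def render_jacobian(t):
    """Analytic derivative of each rendered pixel with respect to t."""
    return [
        sum(c * math.exp(-((x + t - mu) ** 2) / (2.0 * s * s))
            * (-(x + t - mu) / (s * s))
            for mu, s, c in GAUSSIANS)
        for x in PIXELS
    ]


def track(observed, t_init=0.0, iterations=20):
    """Estimate the shift by Gauss-Newton on the photometric residual."""
    t = t_init
    for _ in range(iterations):
        residuals = [r - o for r, o in zip(render(t), observed)]
        jac = render_jacobian(t)
        jtj = sum(j * j for j in jac)
        if jtj < 1e-12:
            break  # no image gradient left to drive the update
        t -= sum(j * r for j, r in zip(jac, residuals)) / jtj
    return t


true_shift = 1.5
estimated_shift = track(render(true_shift))
```

The real system optimizes a full 6-DoF pose against rendered 3D Gaussians, but the structure is the same: render from the current pose estimate, compare against the observed image, and update the pose using derivatives of the rendering, with no separate depth-estimation step.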
The authors introduce several key innovations to enable this approach, including analytic Jacobians for efficient camera pose optimization, geometric regularization of the Gaussian shapes, and a resource allocation and pruning method to maintain a clean and consistent geometric representation.
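As an illustration of what a geometric regularizer on Gaussian shapes might look like, the sketch below penalizes each Gaussian's per-axis scales for deviating from their mean, which discourages highly elongated Gaussians. This is one plausible form, stated here as an assumption; the summary does not give the exact formula, and the function name is hypothetical.

```python
def isotropic_penalty(scales):
    """Sum of L1 deviations of each Gaussian's scales from their mean.

    scales: list of (sx, sy, sz) tuples, one per Gaussian.
    A perfectly spherical Gaussian contributes zero.
    """
    total = 0.0
    for s in scales:
        mean_s = sum(s) / len(s)
        total += sum(abs(v - mean_s) for v in s)
    return total


round_gaussian = [(1.0, 1.0, 1.0)]
stretched_gaussian = [(1.0, 2.0, 3.0)]
penalty_round = isotropic_penalty(round_gaussian)          # 0.0
penalty_stretched = isotropic_penalty(stretched_gaussian)  # 2.0
```

A term like this would be added to the mapping loss with a small weight, so that appearance still dominates while needle-like Gaussians, which tend to cause artifacts in unobserved views, are discouraged.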
Extensive evaluations on both monocular and RGB-D datasets show that the system achieves state-of-the-art performance in camera tracking, mapping, and novel view synthesis, while rendering significantly faster than other methods.
Stats
"We reconstruct a high fidelity 3D scene live at 3fps."
"Our system significantly advances the fidelity a live monocular SLAM system can capture."
Quotes
"We present the first application of 3D Gaussian Splatting in monocular SLAM, the most fundamental but the hardest setup for Visual SLAM."
"Several innovations are required to continuously reconstruct 3D scenes with high fidelity from a live camera."