The paper proposes a hybrid visual odometry (VO) framework that uses pose-only supervision to address the limitations of both traditional geometry-based and deep learning-based VO methods. The key contributions are:
Self-supervised homographic pre-training: This pre-training phase lets the network refine its optical flow estimation and strengthen its feature representations using only a single image, which benefits the subsequent sparse optical flow-based VO task that relies exclusively on pose supervision.
Salient patch detection and refinement: A salient patch detection module identifies points with significant image features, keeping informative patches and discarding the rest. A salient patch refining training step then improves how well the network works with these salient patches, increasing accuracy and reliability, particularly in monotonous environments.
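The homographic pre-training idea can be illustrated with a small sketch: when a single image is warped by a known homography, the optical flow between the original and the warped view is available in closed form, so no external labels are needed. The code below is a minimal NumPy illustration under assumptions of our own; the function names, the corner-jitter sampling scheme, and the `max_shift` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def random_homography(rng, height, width, max_shift=0.1):
    """Sample a homography by jittering the four image corners.

    Hypothetical sampling scheme for illustration only; max_shift is the
    maximum corner displacement as a fraction of the image size.
    """
    src = np.array(
        [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]],
        dtype=np.float64,
    )
    jitter = rng.uniform(-max_shift, max_shift, size=(4, 2)) * np.array([width, height])
    dst = src + jitter

    # Direct linear transform with h33 fixed to 1: each corner pair gives
    # two linear equations in the remaining eight unknowns.
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows), np.array(rhs))
    return np.append(h, 1.0).reshape(3, 3)


def homography_flow(H, height, width):
    """Return the dense flow (height, width, 2) induced by homography H.

    flow[y, x] = H(x, y) - (x, y), in (dx, dy) order.
    """
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # homogeneous pixels
    warped = pts @ H.T
    warped_xy = warped[..., :2] / warped[..., 2:3]  # perspective divide
    return warped_xy - pts[..., :2]


rng = np.random.default_rng(0)
H = random_homography(rng, height=48, width=64)
flow = homography_flow(H, height=48, width=64)  # self-generated supervision target
```

In a training loop, the image and its warped copy would be fed to the flow network and `flow` (sampled at the tracked patch centers) would serve as the target. How the paper samples homographies and weights this loss is not stated in this summary.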
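The salient patch detection step can be pictured as scoring pixels by how much local image structure they carry and keeping only the strongest candidates. This summary does not describe the paper's actual detector, so the sketch below substitutes a simple gradient-magnitude score with one candidate per grid cell; the function name, `cell`, and `min_score` are assumptions made for illustration.

```python
import numpy as np


def select_salient_patches(image, cell=16, min_score=1.0):
    """Pick at most one patch center per grid cell, based on gradient magnitude.

    Stand-in saliency measure (not the paper's detector): cells whose
    strongest gradient is below min_score are treated as uninformative
    and contribute no patch.
    Returns an (N, 2) integer array of (x, y) patch centers.
    """
    gy, gx = np.gradient(image.astype(np.float64))  # derivatives along rows, cols
    score = np.hypot(gx, gy)
    height, width = score.shape

    centers = []
    for y0 in range(0, height - cell + 1, cell):
        for x0 in range(0, width - cell + 1, cell):
            block = score[y0:y0 + cell, x0:x0 + cell]
            dy, dx = np.unravel_index(np.argmax(block), block.shape)
            if block[dy, dx] >= min_score:
                centers.append((x0 + dx, y0 + dy))
    return np.array(centers, dtype=np.int64).reshape(-1, 2)


# A mostly flat image with one textured region: only that cell yields a patch.
image = np.zeros((32, 32))
image[4:10, 4:10] = 10.0
patch_centers = select_salient_patches(image, cell=16, min_score=1.0)
```

Selecting per grid cell keeps the retained patches spread across the image, while the threshold discards flat regions that would contribute little to flow tracking.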
The experiments show that the pose-only supervised method achieves competitive results on standard datasets, along with greater robustness and generalization ability in extreme and unseen scenarios, even when compared to state-of-the-art methods supervised with dense optical flow. A live experiment in a meeting room with significant illumination changes further illustrates the robustness and generalization of the proposed approach.
Key Insights Distilled From
by Siyu Chen, Ka... at arxiv.org 04-09-2024
https://arxiv.org/pdf/2404.04677.pdf