The paper proposes a hybrid visual odometry (VO) framework that utilizes pose-only supervision to address the limitations of traditional geometry-based and deep learning-based VO methods. The key contributions are:
Self-supervised homographic pre-training: Pre-training on homographic transformations of a single image improves the network's optical flow estimation and strengthens its feature representations. This benefits the subsequent sparse optical flow-based VO task, which is trained with pose supervision only.
Salient patch detection and refinement: A salient patch detection module identifies points with significant image features, retaining valuable patches while discarding unnecessary ones. A salient patch refining training step then adapts the network to the selected patches, improving accuracy and reliability, particularly in monotonous environments.
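To make the homographic pre-training idea concrete, here is a minimal sketch of how a ground-truth flow signal can be derived from a single image: once a homography is chosen, the displacement of every pixel is known exactly, so no flow annotation is required. This is not the authors' implementation. The corner-jitter sampling, the function names, and the `max_shift` parameter are assumptions made for illustration, and warping the image itself to produce the second view is omitted.

```python
import numpy as np


def homography_from_points(src, dst):
    """Direct linear transform: find H such that dst ~ H @ src (4+ point pairs)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def random_homography(rng, height, width, max_shift=0.1):
    """Hypothetical sampling scheme: jitter the four image corners."""
    src = np.array(
        [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]],
        dtype=np.float64,
    )
    jitter = rng.uniform(-max_shift, max_shift, size=(4, 2)) * np.array([width, height])
    return homography_from_points(src, src + jitter)


def flow_from_homography(h, points):
    """Ground-truth flow at (N, 2) pixel locations induced by homography h."""
    points = np.asarray(points, dtype=np.float64)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ h.T
    mapped = mapped[:, :2] / mapped[:, 2:3]  # back to pixel coordinates
    return mapped - points


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = random_homography(rng, height=480, width=640)
    patch_centres = rng.uniform([0, 0], [639, 479], size=(8, 2))
    print(flow_from_homography(h, patch_centres))
```

In a pre-training loop of this kind, the flow predicted by the network between the original and the warped image would be compared against the output of `flow_from_homography` at the sampled patch centres.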
The experiments show that the pose-only supervised method achieves competitive results on standard datasets and greater robustness and generalization ability in extreme and unseen scenarios, even compared to dense optical flow-supervised state-of-the-art methods. The live experiment in a meeting room with significant illumination changes demonstrates the superior robustness and generalization of the proposed approach.
Key insights distilled from the paper by Siyu Chen, Ka... (arxiv.org, 04-09-2024)
https://arxiv.org/pdf/2404.04677.pdf