D$^3$epth enhances self-supervised monocular depth estimation in dynamic scenes by introducing a Dynamic Mask to handle inconsistencies caused by moving objects and a Cost Volume Auto-Masking strategy with a Spectral Entropy Uncertainty module to improve multi-frame depth estimation.
SCAT is a novel adversarial training framework that enhances the generalization ability of self-supervised monocular depth estimation models by addressing the training instability caused by sensitive network architectures and conflicting optimization gradients.
The proposed CCDepth network leverages convolutional neural networks (CNNs) and the white-box CRATE transformer to efficiently extract local and global features, enabling lightweight and interpretable depth estimation.
This self-supervised monocular depth estimation network uses large kernel attention to model long-range dependencies while maintaining feature channel adaptivity, together with an upsampling module that accurately recovers fine details in the depth map, and achieves competitive performance on the KITTI dataset.
Manydepth2 leverages optical flow and coarse depth information to construct a motion-guided cost volume, enabling precise and efficient depth estimation for both dynamic objects and static backgrounds.
SPIdepth is a novel self-supervised approach that improves monocular depth estimation by refining the pose network, which leads to substantially more accurate depth predictions.
EC-Depth introduces a novel two-stage framework for robust depth estimation in challenging scenarios.
Self-supervised learning with diverse datasets from YouTube enables zero-shot generalization in monocular depth estimation, outperforming existing approaches and even supervised methods.