Core Concepts
The paper introduces pseudo-depth maps into instance segmentation, using the depth differences between instances to improve mask quality.
Abstract
The research addresses the limitations of box supervision in distinguishing foreground from background within target boxes. By incorporating pseudo-depth maps and a depth prediction layer, the network can simultaneously predict masks and depth, leading to significant improvements on datasets like Cityscapes and COCO.
The study explores the impact of pseudo-depth maps in weakly supervised instance segmentation tasks. By integrating depth features into mask predictions and utilizing depth consistency losses, the network achieves more accurate segmentations. The self-distillation process further refines the model by selecting reliable pseudo masks based on depth matching scores.
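The selection step in self-distillation can be illustrated with a small sketch. The summary does not define the depth matching score, so the definition below is an assumption for illustration only: one minus the mean absolute difference between predicted depth and pseudo-depth over the pixels of a candidate mask, with depths normalized to [0, 1]. The function names, the dictionary keys `mask` and `depth`, and the `threshold` value are all hypothetical.

```python
def depth_matching_score(mask, pred_depth, pseudo_depth):
    """Hypothetical score in [0, 1]: 1 minus the mean absolute depth error
    over the pixels where the mask is set. Not the paper's definition."""
    errors = [
        abs(p - q)
        for m_row, p_row, q_row in zip(mask, pred_depth, pseudo_depth)
        for m, p, q in zip(m_row, p_row, q_row)
        if m
    ]
    if not errors:
        return 0.0  # an empty mask is treated as unreliable
    return 1.0 - sum(errors) / len(errors)


def select_reliable_masks(candidates, pseudo_depth, threshold=0.9):
    """Keep candidate pseudo masks whose score reaches the (assumed) threshold.

    Each candidate is a dict with a binary 'mask' and a predicted 'depth' map,
    both given as nested lists with the same shape as pseudo_depth.
    """
    return [
        c for c in candidates
        if depth_matching_score(c["mask"], c["depth"], pseudo_depth) >= threshold
    ]
```

In this sketch, a candidate whose predicted depth agrees with the pseudo-depth inside its mask is kept as a training target, while a candidate with a large disagreement is dropped.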
The proposed method leverages coarse pseudo-depth maps during training to enhance instance segmentation results. By fusing a depth estimation layer into the mask prediction head and incorporating a depth consistency loss, the network can better perceive instance-level depth features for improved segmentation performance.
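As a rough illustration of what a depth consistency loss could look like, the sketch below averages the absolute difference between the predicted depth and the coarse pseudo-depth over the pixels of one target box. The exact loss is not given in this summary, so the L1 form, the half-open `(x0, y0, x1, y1)` box convention, and the function name are assumptions.

```python
def depth_consistency_loss(pred_depth, pseudo_depth, box):
    """Assumed form: mean absolute error between predicted depth and
    pseudo-depth inside a box. Not the paper's exact loss.

    pred_depth, pseudo_depth: nested lists indexed as [row][column].
    box: (x0, y0, x1, y1), covering columns x0..x1-1 and rows y0..y1-1.
    """
    x0, y0, x1, y1 = box
    total = 0.0
    count = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            total += abs(pred_depth[y][x] - pseudo_depth[y][x])
            count += 1
    # An empty box contributes no loss.
    return total / count if count else 0.0
```

During training, a term of this kind would be added to the mask losses so that the depth prediction layer is pulled toward the pseudo-depth map within each annotated box.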
Stats
The proposed method improves mask AP by 2.7% with ResNet50 on Cityscapes.
The proposed method achieves 41.0% mask AP with Swin-Base on COCO.