Core Concepts
Vision Foundation Models (VFMs) serve as robust backbones for Domain Generalized Semantic Segmentation (DGSS), achieving superior generalizability with fewer trainable parameters.
Stats
VFMs such as CLIP, SAM, and DINOv2 achieve mIoU scores of 65.0%, 60.0%, and 66.0%, respectively.
Rein achieves an mIoU of 68.1% on Cityscapes with just an extra 1% of trainable parameters.
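The figures above are all mIoU (mean Intersection over Union), so a small sketch of the metric may help in reading them. This is a minimal illustration and not the evaluation code behind the reported numbers: the function name `mean_iou`, the flattened-list input, and the `ignore_index` default of 255 are assumptions made for this example. Benchmarks usually accumulate intersections and unions over an entire validation set before averaging, whereas this sketch scores a single prediction.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over the classes that appear in pred or target.

    pred and target are flattened label maps of equal length.
    Predicted labels are assumed to lie in [0, num_classes).
    Pixels whose target equals ignore_index are skipped (assumed convention).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Correct pixel: counts once toward both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Average only over classes that occur at least once.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes; the last pixel is ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.1f}%")
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is about 72.2%.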
Quotes
"Vision Foundation Models serve as robust backbones for Domain Generalized Semantic Segmentation."
"Rein significantly outperforms state-of-the-art methods in DGSS tasks."