Enhancing Temporal Comprehension for Referring Video Segmentation through Decoupled Static and Hierarchical Motion Perception
Decoupling static and motion perception, with a specific emphasis on enhancing temporal comprehension, improves referring video segmentation performance.