Towards Temporally Consistent Referring Video Object Segmentation with Hybrid Memory
The proposed paradigm, Hybrid memory for Temporally consistent Referring video object segmentation (HTR), explicitly models temporal instance consistency alongside referring segmentation and achieves top-ranked performance on benchmark datasets.