Large-scale Long-term Video Object Segmentation Benchmark: Evaluating the Limitations of Existing Models in Real-world Scenarios
The LVOS benchmark shows that existing video object segmentation (VOS) models struggle with long-term videos, which are more representative of real-world scenarios than short-term videos. The performance of these models drops sharply on LVOS compared to short-term video datasets, which points to the need for VOS models robust enough to handle the complexities of long-term videos.