Core Concept
Referring perception models (RPMs) must be robust to a range of perturbations before they can be relied on in real-world applications.
Summary
R2-Bench evaluates the robustness of referring perception models against perturbations such as environmental noise, human-induced errors, and sensor limitations. The benchmark assesses performance across tasks including image segmentation, video object segmentation, audiovisual segmentation, and 3D mapping. It also introduces R2-Agent, an LLM-based assistant that automates model evaluation.
Statistics
RPMs' performance can be compromised by disturbances in real-world scenarios.
Conducting a rigorous analysis of RPMs’ robustness is necessary for building reliable applications.
R2-Bench features a taxonomy of perturbations and a toolbox for synthesis and evaluation.
The benchmark includes tasks like referring image segmentation and audiovisual segmentation.
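The idea of synthesizing a perturbation and then measuring its effect can be illustrated with a minimal sketch. This is not R2-Bench's toolbox or API: the function names (`add_gaussian_noise`, `threshold_model`, `robustness_report`), the toy image, and the choice of additive Gaussian noise and IoU as the metric are all assumptions made for illustration, and the stand-in model ignores the referring expression entirely.

```python
import random


def add_gaussian_noise(image, sigma, seed=0):
    """Return a noisy copy of a grayscale image (rows of floats in [0, 1])."""
    rng = random.Random(seed)  # fixed seed keeps the perturbation reproducible
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


def threshold_model(image, threshold=0.5):
    """Hypothetical stand-in for an RPM: marks pixels brighter than threshold."""
    return [[px > threshold for px in row] for row in image]


def iou(pred, target):
    """Intersection over union of two boolean masks of equal shape."""
    pairs = [(p, t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    inter = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    return inter / union if union else 1.0


def robustness_report(model, image, target, sigmas):
    """Compare IoU on the clean input with IoU at each noise level."""
    clean = iou(model(image), target)
    report = {}
    for sigma in sigmas:
        score = iou(model(add_gaussian_noise(image, sigma)), target)
        report[sigma] = {"iou": score, "drop": clean - score}
    return clean, report


# Toy data (assumed): a bright 4x4 square on a dark 8x8 background.
target = [[2 <= r < 6 and 2 <= c < 6 for c in range(8)] for r in range(8)]
image = [[0.9 if inside else 0.1 for inside in row] for row in target]

clean_iou, report = robustness_report(
    threshold_model, image, target, sigmas=[0.0, 0.1, 0.4]
)
for sigma, entry in report.items():
    print(f"sigma={sigma}: IoU={entry['iou']:.3f}, drop={entry['drop']:.3f}")
```

Reporting the drop relative to the clean score, rather than the perturbed score alone, separates a model's sensitivity to the perturbation from its baseline accuracy.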
Quotes
"RPMs’ performance can be compromised by disturbances in real-world scenarios."
"Conducting a rigorous analysis of RPMs’ robustness is necessary for building reliable applications."