RORA is a robust evaluation method for free-text rationales. It addresses the challenge of label leakage and yields more reliable measurements that align with human judgment.
RORA quantifies the new information that a rationale supplies, which counters label leakage and makes the evaluation robust.