Improving Automatic Evaluation of Factual Consistency in Generated Text by Leveraging Smaller but Cleaner Training Data
The authors propose LIM-RA, an improved factual consistency evaluation model trained on a smaller but cleaner dataset, which outperforms the previous state-of-the-art model, AlignScore, across multiple benchmarks.