Automating Computational Reproducibility: Evaluating AI Agents on a Benchmark for Reproducing Published Research
Automatically verifying the computational reproducibility of published research is a crucial yet challenging task, and doing so reliably could strengthen the credibility of scientific findings.