Length-Controlled AlpacaEval: A Simple Approach to Mitigate Biases in Automated Evaluations of Chatbot Language Models
A simple regression-based method that controls for length bias in AlpacaEval's automated evaluation metric, yielding a more robust and accurate measure of chatbot performance.
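
To make the idea of regression-based length control concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a logistic regression with only two terms, a model-quality intercept (`theta`) and a length coefficient (`phi`) on a squashed, normalized length difference, and it reports the preference the fit would predict if both outputs had equal length. The function name, the tanh squashing, the plain gradient-descent fit, and the synthetic data are all illustrative assumptions; the published method may include further terms.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def length_controlled_win_rate(prefs, len_model, len_base, steps=5000, lr=0.5):
    """Fit P(model preferred) = sigmoid(theta + phi * x), where x is a squashed,
    normalized length difference, then predict with x = 0 (equal lengths).

    Hypothetical helper for illustration only.
    """
    prefs = np.asarray(prefs, dtype=float)
    diff = np.asarray(len_model, dtype=float) - np.asarray(len_base, dtype=float)
    scale = diff.std() or 1.0          # guard against zero variance
    x = np.tanh(diff / scale)          # bounded length feature (an assumption)

    theta, phi = 0.0, 0.0
    for _ in range(steps):             # gradient descent on the logistic loss
        err = sigmoid(theta + phi * x) - prefs
        theta -= lr * err.mean()
        phi -= lr * (err * x).mean()

    # Zeroing the length term leaves only the model-quality term.
    return float(sigmoid(theta)), float(theta), float(phi)


# Synthetic data (fabricated for the demo): the model is never shorter than
# the baseline, and the simulated judge rewards length but has no other preference.
rng = np.random.default_rng(0)
n = 5000
len_base = rng.integers(100, 500, size=n)
len_model = len_base + rng.integers(0, 400, size=n)
diff = (len_model - len_base).astype(float)
true_p = sigmoid(1.5 * np.tanh(diff / diff.std()))
prefs = (rng.random(n) < true_p).astype(float)

raw = float(prefs.mean())
lc, theta, phi = length_controlled_win_rate(prefs, len_model, len_base)
print(f"raw win rate: {raw:.3f}, length-controlled win rate: {lc:.3f}")
```

In this synthetic setup the raw win rate is inflated purely by verbosity, while the length-controlled estimate removes the portion of the preference that the regression attributes to length.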