Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering
Instruction-following models can perform question answering by drawing on provided text passages, but their verbose responses make traditional evaluation metrics unreliable. This work proposes evaluating these models along two dimensions: correctness, meaning how well the response satisfies the user's information need, and faithfulness, meaning how well the response is grounded in the provided knowledge.
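To illustrate why verbosity can make traditional metrics unreliable, the following is a minimal sketch rather than the evaluation protocol of this work. It assumes that "traditional metrics" refers to exact match and token-level F1, and it uses token recall against the reference as a stand-in for correctness and token precision against the passage as a stand-in for faithfulness. The function names, the normalization rules, and the example strings are all hypothetical.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(response: str, reference: str) -> float:
    """1.0 only if the normalized response equals the normalized reference."""
    return float(normalize(response) == normalize(reference))


def token_overlap(response: str, target: str) -> tuple[float, float, float]:
    """Return (precision, recall, F1) of response tokens against target tokens."""
    resp, tgt = normalize(response), normalize(target)
    common = sum((Counter(resp) & Counter(tgt)).values())
    if common == 0:
        return 0.0, 0.0, 0.0
    precision = common / len(resp)  # share of response tokens found in target
    recall = common / len(tgt)      # share of target tokens found in response
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: a correct but verbose answer to "What is the capital of France?"
passage = "Paris is the capital and largest city of France."
reference = "Paris"
response = "Based on the passage, the capital of France is Paris."

em = exact_match(response, reference)
_, recall, f1 = token_overlap(response, reference)
# Assumed faithfulness proxy: precision of the response against the passage.
grounding, _, _ = token_overlap(response, passage)

print(f"exact match: {em:.2f}")        # 0.00, although the answer is right
print(f"token F1:    {f1:.2f}")        # 0.22, penalized for the extra words
print(f"recall:      {recall:.2f}")    # 1.00, the reference is fully covered
print(f"grounding:   {grounding:.2f}") # 0.62, share of tokens seen in the passage
```

In this toy case the response is correct, yet exact match is zero and F1 is low because both penalize length, while recall is unaffected by the extra words. Token overlap is only a rough proxy for either dimension; it is used here because it is easy to compute, not because the source prescribes it.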