Evaluating the Groundedness of Long-form Outputs Generated by Retrieval-augmented Language Models
A significant fraction of the sentences generated by retrieval-augmented language models, even those containing correct answers, are grounded in neither the provided context nor the models' pre-training data.