Evaluating the Factual Accuracy of Large Language Models Using Comprehensive Knowledge Graphs
Large language models (LLMs) can generate factually incorrect responses, a phenomenon known as hallucination. This paper proposes GraphEval, a framework that efficiently evaluates the factuality of LLMs against a large-scale knowledge graph containing over 10 million facts.
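The core idea of evaluating factuality against a knowledge graph can be sketched as follows: each triple in the graph is a known-true fact, so it can be verbalized into a statement and posed to the model as a true/false judgment, with accuracy measured as the fraction of facts the model affirms. The triple format, the `triple_to_statement` verbalizer, and the `stub_judge` function below are illustrative assumptions, not the paper's actual implementation (in practice the judge would call an LLM).

```python
def triple_to_statement(head, relation, tail):
    """Verbalize a knowledge-graph triple as a natural-language statement.

    Illustrative assumption: relations are snake_case predicates.
    """
    return f"{head} {relation.replace('_', ' ')} {tail}."


def evaluate_factuality(triples, judge):
    """Score a model's factuality over a set of KG triples.

    Every triple is a known fact, so the score is simply the fraction
    of verbalized statements the judge affirms as true.
    """
    affirmed = sum(1 for h, r, t in triples if judge(triple_to_statement(h, r, t)))
    return affirmed / len(triples)


# Stub standing in for an LLM-based true/false classifier.
def stub_judge(statement):
    return "capital of" in statement


triples = [
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
    ("Mount Everest", "located_in", "Nepal"),
]
score = evaluate_factuality(triples, stub_judge)
print(score)
```

Replacing `stub_judge` with a call to a real model turns this into a hallucination probe: any fact the model rejects (or contradicts) counts against its factuality score.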