Core Concepts
Leveraging logic programming and metamorphic testing, HalluVault automatically generates diverse and reliable test cases to detect fact-conflicting hallucinations in large language models.
Abstract
The paper introduces HalluVault, a novel framework that leverages logic programming and metamorphic testing to automatically generate diverse and reliable test cases for detecting fact-conflicting hallucinations (FCH) in large language models (LLMs).
Key highlights:
Factual Knowledge Extraction: HalluVault extracts fundamental facts from knowledge databases into fact triples that can be used for logical reasoning (see the first sketch after this list).
Logical Reasoning: HalluVault applies five types of logical reasoning rules (negation, symmetric, inverse, composition, and transitivity) to automatically derive new factual knowledge from the extracted facts (second sketch below).
Benchmark Construction: HalluVault creates high-quality test case-oracle pairs from the newly derived ground-truth knowledge. The test oracles follow a metamorphic relation: questions complying with the knowledge should be answered "YES", and questions contravening it should be answered "NO" (third sketch below).
Response Evaluation: HalluVault automatically evaluates the responses from LLMs and checks their factual consistency. It constructs semantic-aware structures from the LLM outputs and assesses their similarity to the ground truth to flag fact-conflicting answers (fourth sketch below).
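A minimal sketch of the fact-triple representation the extraction step produces, assuming a simple tabular knowledge source; FactTriple and extract_triples are illustrative names, not HalluVault's actual API.

```python
# Hypothetical sketch: represent knowledge-base records as fact triples.
from dataclasses import dataclass

@dataclass(frozen=True)
class FactTriple:
    subject: str
    relation: str
    obj: str  # "object" would shadow the Python builtin, so "obj" is used

def extract_triples(rows):
    """Turn (subject, relation, object) records from a knowledge base
    into FactTriple instances for downstream logical reasoning."""
    return [FactTriple(s, r, o) for s, r, o in rows]

facts = extract_triples([
    ("Marie Curie", "award_received", "Nobel Prize in Physics"),
    ("Pierre Curie", "spouse_of", "Marie Curie"),
])
print(facts[0])
```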
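A sketch of how two of the five rule types (symmetric and transitive) could derive new triples from existing ones; the relation vocabularies and the derive helper are assumptions made for illustration, not the paper's implementation.

```python
# Hypothetical sketch of symmetric and transitive rule application
# over (subject, relation, object) tuples.
SYMMETRIC_RELATIONS = {"spouse_of", "sibling_of"}
TRANSITIVE_RELATIONS = {"located_in", "part_of"}

def derive(facts):
    """Apply symmetric and transitive rules once, returning only triples
    that are not already in the input fact set."""
    derived = set()
    for s, r, o in facts:
        if r in SYMMETRIC_RELATIONS:
            derived.add((o, r, s))  # symmetric: r(a, b) => r(b, a)
    for s1, r1, o1 in facts:
        for s2, r2, o2 in facts:
            if r1 == r2 and r1 in TRANSITIVE_RELATIONS and o1 == s2:
                derived.add((s1, r1, o2))  # transitive: r(a, b), r(b, c) => r(a, c)
    return derived - set(facts)

facts = {
    ("Paris", "located_in", "France"),
    ("France", "located_in", "Europe"),
    ("Pierre Curie", "spouse_of", "Marie Curie"),
}
print(sorted(derive(facts)))
# [('Marie Curie', 'spouse_of', 'Pierre Curie'), ('Paris', 'located_in', 'Europe')]
```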
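A sketch of the test case-oracle pairing under the metamorphic relation stated above; the question templates are assumptions, not the paper's exact prompts.

```python
# Hypothetical sketch: derive a YES-oracle and a NO-oracle question
# from one ground-truth triple, per the metamorphic relation.
def make_test_cases(triple):
    s, r, o = triple
    statement = f"{s} {r.replace('_', ' ')} {o}"
    return [
        # A question complying with the knowledge expects "YES".
        (f"Is the following statement true: '{statement}'? Answer YES or NO.", "YES"),
        # A question contravening the knowledge expects "NO".
        (f"Is the following statement true: 'it is not the case that "
         f"{statement}'? Answer YES or NO.", "NO"),
    ]

for question, oracle in make_test_cases(("Paris", "located_in", "Europe")):
    print(oracle, "<-", question)
```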
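HalluVault's checker builds semantic-aware structures from LLM outputs; as a simplified stand-in, the sketch below reduces that comparison to stance extraction plus an oracle check, an assumption made here for brevity.

```python
# Simplified, hypothetical stand-in for the response checker.
import re

def extract_stance(answer):
    """Pull the first standalone YES/NO token out of a free-form answer."""
    match = re.search(r"\b(yes|no)\b", answer, re.IGNORECASE)
    return match.group(1).upper() if match else None

def is_hallucination(answer, oracle):
    """Flag a fact-conflicting hallucination when the extracted stance
    disagrees with the ground-truth oracle."""
    return extract_stance(answer) != oracle

print(is_hallucination("No, Paris is not located in Europe.", "YES"))  # True: FCH detected
print(is_hallucination("Yes, that statement is correct.", "YES"))      # False: consistent
```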
The evaluation of HalluVault on six different LLMs across nine domains reveals hallucination rates ranging from 24.7% to 59.8%. The results highlight the challenges LLMs face with temporal concepts, out-of-distribution knowledge, and logical reasoning capabilities. The authors also investigate model editing techniques to mitigate the identified FCHs, demonstrating promising results on a limited scale.
Stats
Hallucination response rates range from 24.7% to 59.8% across the tested LLMs and domains.
LLMs particularly struggle with temporal concepts and out-of-distribution knowledge.
LLMs exhibit deficiencies in logical reasoning capabilities, which contribute the most to the FCH issues.
Quotes
"Fact-Conflicting Hallucination (FCH) occurs when LLMs generate content that directly contradicts established facts."
"The key to determining if an LLM has produced an FCH lies in assessing whether the overall logical reasoning behind its answer is consistent with the established ground truth."
"Test cases generated using our logical reasoning rules can effectively trigger and detect hallucination issues in LLMs."