This work leverages unlabeled large language model generations collected in the wild to detect hallucinated content effectively through an automated membership estimation approach.
This work detects a specific subclass of hallucinations, termed confabulations, in large language models in order to address the problem of factually incorrect or irrelevant responses.
Large language models often generate false or unsubstantiated outputs, known as "hallucinations", which prevent their adoption in critical domains. This work proposes a general method to detect a subset of hallucinations, called "confabulations", by estimating the semantic entropy of model outputs.
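A minimal sketch of the semantic-entropy idea, under the assumption that several answers are sampled for the same prompt together with their sequence log-probabilities; `same_meaning` is a hypothetical callable (e.g., backed by a bidirectional-entailment check with an NLI model) and is not part of the original summary. Answers are clustered by meaning, probability mass is aggregated per cluster, and high entropy over clusters flags a likely confabulation.

```python
import math

def semantic_entropy(answers, logprobs, same_meaning):
    """Estimate semantic entropy over sampled answers for one prompt.

    answers:      list of sampled completions
    logprobs:     per-answer sequence log-probabilities from the model
    same_meaning: hypothetical callable(a, b) -> bool, e.g. bidirectional
                  entailment judged by an NLI model
    """
    # Greedily cluster answers into meaning-equivalence classes.
    clusters = []  # each cluster is a list of answer indices
    for i, ans in enumerate(answers):
        for cluster in clusters:
            if same_meaning(answers[cluster[0]], ans):
                cluster.append(i)
                break
        else:
            clusters.append([i])

    # Convert log-probabilities to normalized probabilities
    # (subtract the max for numerical stability).
    m = max(logprobs)
    probs = [math.exp(lp - m) for lp in logprobs]
    total = sum(probs)

    # Aggregate probability mass per semantic cluster.
    cluster_mass = [sum(probs[i] for i in c) / total for c in clusters]

    # Shannon entropy over clusters: high entropy -> likely confabulation.
    return -sum(p * math.log(p) for p in cluster_mass if p > 0)
```

In use, one would sample a handful of answers at non-zero temperature, compute this score, and flag the prompt as likely confabulated when the entropy exceeds a tuned threshold.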
The Hallucinations Leaderboard is an open initiative to quantitatively measure and compare the tendency of large language models to produce hallucinations, that is, outputs that do not align with factual reality or the input context.
Hallucinations in large language models can be detected effectively by using tractable probabilistic models to analyze the model's internal state-transition dynamics during generation.
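The summary leaves the exact model class open; purely as an illustration, the sketch below fits a Gaussian HMM (a simple tractable probabilistic model standing in for whatever the paper uses) over per-token hidden-state trajectories and scores new generations by negative log-likelihood, so trajectories that the dynamics model explains poorly are flagged as suspicious.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def fit_state_dynamics_model(hidden_state_seqs, n_components=8):
    """Fit a Gaussian HMM over per-token hidden-state trajectories.

    hidden_state_seqs: list of arrays of shape (seq_len, d), e.g.
    dimension-reduced transformer hidden states for known-good generations.
    """
    X = np.concatenate(hidden_state_seqs)          # (total_tokens, d)
    lengths = [len(s) for s in hidden_state_seqs]  # per-sequence lengths
    hmm = GaussianHMM(n_components=n_components, covariance_type="diag",
                      n_iter=50, random_state=0)
    hmm.fit(X, lengths)
    return hmm

def hallucination_score(hmm, hidden_state_seq):
    """Higher score = lower per-token likelihood under the dynamics model."""
    return -hmm.score(hidden_state_seq) / len(hidden_state_seq)
```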
The SHROOM-INDElab system uses prompt engineering and in-context learning with large language models (LLMs) to build classifiers for hallucination detection, achieving competitive performance in the SemEval-2024 Task 6 competition.
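A hedged sketch of the prompt-based classification idea (not the authors' exact prompts or pipeline): a few-shot prompt asks a judging LLM whether a generated answer is supported by its source, and the verdict is parsed into a binary label. `call_llm` is a hypothetical stand-in for whatever chat-completion API is used.

```python
FEW_SHOT_EXAMPLES = """\
Source: The Eiffel Tower is in Paris.
Answer: The Eiffel Tower is located in Berlin.
Verdict: Hallucination

Source: Water boils at 100 degrees Celsius at sea level.
Answer: At sea level, water boils at 100 C.
Verdict: Not Hallucination
"""

def detect_hallucination(source: str, answer: str, call_llm) -> bool:
    """Return True if the judging LLM deems `answer` unsupported by `source`.

    call_llm: hypothetical callable(prompt: str) -> str wrapping the
              chat/completions endpoint of the judging model.
    """
    prompt = (
        "Decide whether the Answer is supported by the Source. "
        "Reply with exactly 'Hallucination' or 'Not Hallucination'.\n\n"
        f"{FEW_SHOT_EXAMPLES}\n"
        f"Source: {source}\nAnswer: {answer}\nVerdict:"
    )
    verdict = call_llm(prompt).strip().lower()
    return verdict.startswith("hallucination")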
KnowHalu proposes a two-phase process for detecting hallucinations in text generated by large language models (LLMs). The first phase identifies non-fabrication hallucinations, i.e., responses that are not false but fail to address the query, while the second phase performs multi-form knowledge-based factual checking to detect fabrication hallucinations. A high-level sketch of such a pipeline follows.
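This sketch is written under stated assumptions rather than from the paper's code: `is_relevant_and_specific`, `decompose_query`, `retrieve_knowledge`, and `supported_by` are hypothetical helpers standing in for the relevance check, query decomposition, multi-form knowledge retrieval, and judgment steps.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    hallucinated: bool
    reason: str

def two_phase_check(question, answer, is_relevant_and_specific,
                    decompose_query, retrieve_knowledge, supported_by):
    """Two-phase hallucination check (sketch with hypothetical helpers).

    Phase 1: flag non-fabrication hallucinations (answers that do not
             actually address the question).
    Phase 2: decompose the question into sub-queries, retrieve knowledge
             in multiple forms (e.g., triples and free text), and check
             the answer against the retrieved evidence.
    """
    # Phase 1: non-fabrication hallucination check.
    if not is_relevant_and_specific(question, answer):
        return Verdict(True, "answer is irrelevant or non-specific")

    # Phase 2: multi-form knowledge-based factual checking.
    for sub_query in decompose_query(question, answer):
        evidence = retrieve_knowledge(sub_query)  # e.g., triples + passages
        if not supported_by(answer, evidence):
            return Verdict(True, f"claim unsupported for: {sub_query}")

    return Verdict(False, "all checked claims supported")
```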