The paper examines the problem of hallucination in large language models (LLMs) and presents a prompting framework called "Sorry, Come Again?" (SCA) to mitigate it. Key highlights:
Investigates the impact of linguistic features (readability, formality, concreteness) of prompts on hallucination across 21 LLMs. Prompts with lower readability, formality, or concreteness pose comprehension challenges for LLMs, leading to hallucination.
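As a rough illustration of how such prompt attributes can be quantified, here is a minimal sketch using common proxies rather than the paper's exact metrics: the textstat library's Flesch Reading Ease for readability, and a word-norm lookup for concreteness (the norm table below is a placeholder; real values would come from a published concreteness lexicon).

```python
# Hedged sketch: scoring two of the prompt attributes above with common proxies.
# textstat.flesch_reading_ease is a real library function; CONCRETENESS_NORMS is
# a placeholder standing in for a real concreteness lexicon.
import textstat

def readability(prompt: str) -> float:
    # Higher Flesch Reading Ease = easier to read; the study links lower
    # readability to more hallucination.
    return textstat.flesch_reading_ease(prompt)

CONCRETENESS_NORMS = {"dog": 4.9, "run": 4.0, "justice": 1.8, "notion": 1.6}  # illustrative values only

def concreteness(prompt: str) -> float:
    # Mean concreteness rating over the words we have norms for.
    words = [w.lower().strip(".,?!") for w in prompt.split()]
    rated = [CONCRETENESS_NORMS[w] for w in words if w in CONCRETENESS_NORMS]
    return sum(rated) / len(rated) if rated else 0.0
```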
Introduces an optimal-paraphrasing technique to identify the most comprehensible paraphrase of a given prompt, evaluated using Integrated Gradients and its variants.
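A minimal, manual sketch of how Integrated Gradients could rank paraphrases (not the authors' exact procedure: the model, zero-embedding baseline, step count, and attribution target below are illustrative choices): attribute the model's next-token confidence to the prompt's token embeddings and prefer the paraphrase with the strongest aggregate attribution.

```python
# Hedged sketch: manual Integrated Gradients over prompt token embeddings,
# used to compare paraphrases of the same prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative small model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def prompt_attribution(prompt: str, steps: int = 32) -> float:
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    emb = model.get_input_embeddings()(ids).detach()   # (1, T, D) prompt embeddings
    baseline = torch.zeros_like(emb)                    # zero-embedding baseline
    total_grads = torch.zeros_like(emb)
    for k in range(1, steps + 1):
        alpha = k / steps
        interp = (baseline + alpha * (emb - baseline)).detach().requires_grad_(True)
        logits = model(inputs_embeds=interp).logits[0, -1]
        score = torch.log_softmax(logits, dim=-1).max()  # confidence in the top next token
        (grad,) = torch.autograd.grad(score, interp)
        total_grads += grad
    ig = (emb - baseline) * total_grads / steps          # Riemann-sum IG estimate
    return ig.abs().sum().item()                         # aggregate attribution over the prompt

def best_paraphrase(paraphrases: list[str]) -> str:
    # Treat the paraphrase with the highest aggregate attribution as best comprehended.
    return max(paraphrases, key=prompt_attribution)
```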
Proposes injecting [PAUSE] tokens to delay LLM generation and aid comprehension. Determines the optimal position and number of [PAUSE] tokens based on the abstractness of the prompt.
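A toy sketch of the injection step (the spacing rule and the abstractness-to-count mapping here are illustrative, not the paper's learned policy): more abstract prompts receive more [PAUSE] tokens, spread through the prompt to give the model extra processing steps.

```python
# Hedged sketch: spread [PAUSE] tokens through a prompt, with the count
# growing with the prompt's abstractness (1 - concreteness).
def inject_pause(prompt: str, abstractness: float, max_pauses: int = 5) -> str:
    words = prompt.split()
    n_pauses = max(1, round(abstractness * max_pauses))      # more abstract -> more pauses
    stride = max(1, len(words) // (n_pauses + 1))            # spacing between pauses
    out = []
    for i, word in enumerate(words, start=1):
        out.append(word)
        if i % stride == 0 and out.count("[PAUSE]") < n_pauses:
            out.append("[PAUSE]")
    return " ".join(out)

# Example: an abstract prompt gets several pauses spread evenly through it.
print(inject_pause("Explain the notion of epistemic justice in modern philosophy", abstractness=0.8))
```

In practice, [PAUSE] would also be registered as a special token in the tokenizer so the fine-tuning step described next can learn how to use it.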
Introduces a novel fine-tuning approach called "Reverse Proxy-Tuning" to efficiently fine-tune LLMs with [PAUSE] tokens.
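The exact "reverse" formulation is defined in the paper; as background, standard proxy-tuning steers a large untuned model at decoding time with the logit difference between a small tuned expert and its small untuned counterpart. A minimal sketch of that logit arithmetic, shown here only as the general recipe the paper builds on:

```python
# Hedged sketch: the decoding-time logit arithmetic behind proxy-tuning.
# All tensors are next-token logits of shape (vocab_size,) over a shared vocabulary.
import torch

def proxy_tuned_logits(base_logits: torch.Tensor,
                       expert_logits: torch.Tensor,
                       antiexpert_logits: torch.Tensor) -> torch.Tensor:
    # Steer the large base model by the small expert/anti-expert difference.
    return base_logits + (expert_logits - antiexpert_logits)

# At each decoding step, sample from the adjusted distribution instead of the base one:
# probs = torch.softmax(proxy_tuned_logits(base, expert, antiexpert), dim=-1)
```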
Presents ACTIVATOR, an end-to-end framework that selects the optimal paraphrased prompt and evaluates the generated content for hallucination using textual entailment.
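A hedged sketch of the entailment check at the end of such a pipeline (the NLI model, label indexing, and threshold are illustrative; the full ACTIVATOR framework also handles paraphrase selection upstream): each generated claim is tested for entailment against the source context, and non-entailed claims are flagged as potential hallucination.

```python
# Hedged sketch: flag generated claims that are not entailed by the source context.
# roberta-large-mnli is one off-the-shelf NLI model; the paper does not necessarily use it.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
nli_model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

def entails(premise: str, hypothesis: str, threshold: float = 0.5) -> bool:
    inputs = tok(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(nli_model(**inputs).logits[0], dim=-1)
    # Label order for roberta-large-mnli: 0 = contradiction, 1 = neutral, 2 = entailment.
    return probs[2].item() >= threshold

def flag_hallucinations(source: str, generated_claims: list[str]) -> list[str]:
    # Claims not entailed by the source are flagged for review or regeneration.
    return [claim for claim in generated_claims if not entails(source, claim)]
```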
The study demonstrates that enhancing LLM comprehension through optimal paraphrasing and [PAUSE] injection can effectively reduce hallucination in generated content.
Key insights distilled from the paper by Vipula Rawte et al., arXiv, 2024-03-29: https://arxiv.org/pdf/2403.18976.pdf