Core Concepts
Evaluating hallucination in large language models using unanswerable math word problems is essential for improving model performance.
Stats
"Unanswerable questions can serve as a means to evaluate the degree of hallucination in LLMs, just as teachers often use unanswerable questions to gauge students’ understanding of certain concepts." - Rajpurkar et al. (2018)
"The identification process is described as follows: LLMs’ output is tokenized by the open-source tool Spacy." - Zhao et al. (2023)
"We adopt the F1 score as the metric for evaluating LLMs’ degree of hallucination." - Research findings
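The excerpts above describe a pipeline of tokenizing model output and scoring it with F1. A minimal sketch of token-level F1 scoring, assuming a simple whitespace tokenizer as a stand-in (the paper uses Spacy, and the function name `token_f1` is illustrative, not from the source):

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a model response and a reference answer.

    A hedged sketch of the evaluation described in the excerpts;
    the actual work tokenizes with Spacy rather than str.split().
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most
    # min(count in prediction, count in reference) times.
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, a response that omits one of four reference tokens scores precision 1.0 and recall 0.75, giving F1 ≈ 0.857.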
Quotes
"Large language models (LLMs) are highly effective in various natural language processing (NLP) tasks." - Abstract
"We show that utilizing MWP is a reliable and effective approach to assess hallucination." - Research findings