The SHROOM-INDElab team participated in the SemEval-2024 Task 6 competition, which focused on hallucination detection in the outputs of language models. The team developed a two-stage system that uses prompt engineering and in-context learning with LLMs to classify whether a given model output contains a hallucination.
In the first stage, the system uses a zero-shot approach, where the LLM is prompted with task, role, and concept definitions to classify the data points without any examples. The classified data points from this stage are then used to select a few-shot example set for the second stage.
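A minimal sketch of how such a zero-shot prompt might be assembled is shown below. The prompt wording, the label strings, and the `call_llm` placeholder (standing in for whatever chat-completion client is used) are illustrative assumptions, not the team's exact implementation.

```python
# Zero-shot stage: the prompt combines a role, a task description, and an
# explicit definition of the hallucination concept, with no labelled examples.
ROLE = "You are an annotator checking language model outputs for hallucination."
CONCEPT = ("A hallucination is output that is fluent but unsupported by, "
           "or contradictory to, the source input and reference.")
TASK = ("Given the source, the model output, and the target reference, answer "
        "with exactly one word: 'Hallucination' or 'Not Hallucination'.")

def build_zero_shot_prompt(src: str, hyp: str, tgt: str) -> str:
    return "\n\n".join([
        ROLE, CONCEPT, TASK,
        f"Source: {src}", f"Output: {hyp}", f"Target: {tgt}", "Answer:",
    ])

def classify_zero_shot(call_llm, src: str, hyp: str, tgt: str) -> str:
    # call_llm is a placeholder: any function that maps a prompt to a completion.
    answer = call_llm(build_zero_shot_prompt(src, hyp, tgt)).strip()
    return ("Hallucination" if answer.lower().startswith("hallucination")
            else "Not Hallucination")
```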
In the second stage, the system uses the selected examples along with the task, role, and concept definitions to prompt the LLM for a few-shot classification. The team experimented with different hyperparameters, such as temperature, number of examples, and number of samples, to optimize the classifier's performance.
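The sketch below continues the one above and shows how the second stage might combine stage-one labels with repeated sampling. The uniform draw from the example pool, the majority vote, and the default hyperparameter values are illustrative assumptions; only the tuned hyperparameters themselves (number of examples, number of samples) come from the description above.

```python
import random
from collections import Counter

# HEADER stands in for the same role, task, and concept definitions used
# in the zero-shot sketch above.
HEADER = "<role, task, and hallucination definitions>"

def build_few_shot_prompt(examples, src, hyp, tgt):
    # examples: (source, output, target, label) tuples drawn from the
    # data points labelled in the zero-shot stage.
    shots = "\n\n".join(
        f"Source: {s}\nOutput: {h}\nTarget: {t}\nAnswer: {label}"
        for s, h, t, label in examples
    )
    query = f"Source: {src}\nOutput: {hyp}\nTarget: {tgt}\nAnswer:"
    return "\n\n".join([HEADER, shots, query])

def classify_few_shot(call_llm, pool, src, hyp, tgt,
                      n_examples=4, n_samples=5, seed=0):
    # n_examples and n_samples correspond to hyperparameters the team tuned;
    # sampling the LLM several times and voting is one way to aggregate.
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_samples):
        examples = rng.sample(pool, n_examples)
        votes[call_llm(build_few_shot_prompt(examples, src, hyp, tgt)).strip()] += 1
    label, count = votes.most_common(1)[0]
    return label, count / n_samples  # majority label and its vote share
```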
The SHROOM-INDElab system achieved competitive results, ranking fourth and sixth in the model-agnostic and model-aware tracks of the competition, respectively. The system's classifications were also found to be consistent with those of the crowd-sourced human labelers, as indicated by Spearman's correlation coefficient.
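As a brief illustration of that consistency check, Spearman's rank correlation can be computed between the system's estimated hallucination probabilities and the fraction of human annotators labelling each item a hallucination. The arrays below are made-up placeholders, not data from the paper; `scipy.stats.spearmanr` is one standard way to compute the coefficient.

```python
from scipy.stats import spearmanr

system_p = [0.8, 0.2, 0.6, 0.0, 1.0]  # system's p(Hallucination) per item (illustrative)
human_p  = [0.9, 0.1, 0.5, 0.2, 0.8]  # fraction of annotators saying "Hallucination" (illustrative)

rho, pval = spearmanr(system_p, human_p)
print(f"Spearman's rho = {rho:.3f} (p = {pval:.3f})")
```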
The team's ablation study revealed that the explicit definition of the hallucination concept was a crucial component of the system's performance, suggesting the importance of including intensional definitions of concepts in prompts for LLM-based classifiers. The team plans to further investigate this approach for evaluating natural language rationale generation in the context of zero- and few-shot chain-of-thought classifiers.