The authors introduce NoMIRACL, a dataset for evaluating LLM robustness in multilingual retrieval-augmented generation. The study shows where current LLMs fall short and argues for improved robustness.
LLMs struggle to balance hallucination and error rates in multilingual retrieval-augmented generation.
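The tension between the two rates can be made concrete with a small sketch. It assumes a binary setup: each query either has a relevant retrieved passage or does not, and the model either claims an answer is present or abstains. Under that assumption, the hallucination rate is the share of queries without a relevant passage where the model still claims an answer, and the error rate is the share of queries with a relevant passage where the model abstains. The names `Prediction`, `hallucination_rate`, and `error_rate` are hypothetical and used only for illustration; they are not taken from the NoMIRACL code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    # Ground truth (assumed label): does any retrieved passage answer the query?
    has_relevant_passage: bool
    # Model output (assumed binary): did the model claim an answer is present?
    model_says_answerable: bool


def hallucination_rate(preds: List[Prediction]) -> float:
    """Share of queries with no relevant passage where the model still claims an answer."""
    non_relevant = [p for p in preds if not p.has_relevant_passage]
    if not non_relevant:
        return 0.0
    return sum(p.model_says_answerable for p in non_relevant) / len(non_relevant)


def error_rate(preds: List[Prediction]) -> float:
    """Share of queries with a relevant passage where the model abstains."""
    relevant = [p for p in preds if p.has_relevant_passage]
    if not relevant:
        return 0.0
    return sum(not p.model_says_answerable for p in relevant) / len(relevant)


# Toy data, invented for illustration only: 4 queries without and 4 with a relevant passage.
preds = [
    Prediction(False, True),   # hallucination: answers despite no relevant passage
    Prediction(False, False),
    Prediction(False, False),
    Prediction(False, False),
    Prediction(True, True),
    Prediction(True, True),
    Prediction(True, False),   # error: abstains despite a relevant passage
    Prediction(True, False),   # error
]

print(hallucination_rate(preds), error_rate(preds))
```

A model that always abstains drives the hallucination rate to zero but the error rate to one, and a model that always answers does the reverse, which is why the two rates have to be read together.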
NoMIRACL evaluates LLM robustness in multilingual retrieval-augmented generation and highlights where current models fall short.
LLM Robustness Evaluation in Multilingual Retrieval-Augmented Generation