Core Concepts
The author evaluates multilingual LLMs' factual accuracy using a novel pipeline and highlights geographical biases in fact generation.
Abstract
The study assesses the factual accuracy of multilingual LLMs and finds a Western-centric bias. English outperforms other languages in generating correct facts. The research questions ask whether factual accuracy is uniform across languages and whether a model's precision is higher for facts tied to the regions where the prompt language is spoken. The methodology covers task selection, model choice, prompt translation, and factuality measurement. Results show significant variation in factuality across languages and geographic regions, underscoring the need for better assessment methods.
Stats
English consistently maintains an advantage in both factual accuracy and quantity of generated facts compared to other languages.
Across the analyzed languages, accurate outputs concentrate on America and Europe.
Languages exhibit preferential accuracy towards regions where they are predominantly spoken.
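The comparisons summarized above come down to tallying how many generated facts are judged correct, grouped by prompt language or by the region a fact concerns. The following is a minimal sketch of that tally, not the study's pipeline: the record layout, the field names (`language`, `region`, `correct`), and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def precision_by_group(records, key):
    """Return {group: fraction of facts judged correct}, grouped by `key`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[key]
        total[group] += 1
        if rec["correct"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical, hand-made records (NOT data from the study).
# Each record is one generated fact with a correctness judgment.
records = [
    {"language": "en", "region": "Europe", "correct": True},
    {"language": "en", "region": "Europe", "correct": True},
    {"language": "en", "region": "Asia", "correct": True},
    {"language": "en", "region": "Asia", "correct": False},
    {"language": "zh", "region": "Asia", "correct": True},
    {"language": "zh", "region": "Europe", "correct": False},
]

by_language = precision_by_group(records, "language")
by_region = precision_by_group(records, "region")

print(by_language)
print(by_region)
```

Grouping by a (language, region) pair instead of a single key would be the natural extension for checking whether a language is more accurate about the regions where it is spoken.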