Core Concepts
This paper introduces CVQA, a benchmark dataset designed to evaluate the cultural awareness and multilingual capabilities of visual question answering (VQA) models, with questions and images spanning 30 countries and 31 languages.
Romero, D., Lyu, C., Wibowo, H. A., Lynn, T., Hamed, I., Kishore, A. N., ... & Aji, A. F. (2024). CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark. Advances in Neural Information Processing Systems, 37.
CVQA addresses the limitations of existing VQA datasets, which lack diversity in languages and cultural contexts. The authors aim to provide a challenging benchmark for evaluating the cultural knowledge and biases of multimodal models, particularly their ability to understand and reason over culturally diverse images and text.
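As a minimal sketch of how a multiple-choice benchmark like CVQA might be used, the snippet below scores a placeholder model over the dataset; the Hugging Face identifier afaji/cvqa, the test split, and the field names image, question, options, and label are assumptions for illustration, not details confirmed by the paper.

from datasets import load_dataset
import random

def answer_question(image, question, options):
    # Placeholder "model": picks a random option index.
    # Swap in a real multimodal model's prediction here.
    return random.randrange(len(options))

# Dataset identifier, split name, and field names are assumptions for illustration.
dataset = load_dataset("afaji/cvqa", split="test")

correct = 0
for example in dataset:
    prediction = answer_question(example["image"], example["question"], example["options"])
    correct += int(prediction == example["label"])

print(f"Accuracy over {len(dataset)} questions: {correct / len(dataset):.3f}")

Because the questions are multiple choice, accuracy over the option labels is the natural evaluation metric, and a random-guess baseline like the one above gives a floor against which a multimodal model's cultural knowledge can be compared.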