Multilingual Visual Question Answering: A Challenging Benchmark for Evaluating Cross-Language AI Systems
The EVJVQA dataset is a challenging benchmark for evaluating multilingual visual question answering systems. It covers three languages (Vietnamese, English, and Japanese), with questions posed about images from Vietnam. The dataset aims to motivate research on cross-language AI models that can understand visual content and answer questions in multiple languages.