Comprehensive Evaluation of Multimodal Large Language Models on High-Resolution Real-World Scenarios
Even the most advanced multimodal large language models struggle on MME-RealWorld, a new benchmark featuring high-resolution images and challenging real-world scenarios.