Visual Prompting in Multimodal Large Language Models: A Comprehensive Survey
This paper presents a comprehensive survey of visual prompting methods in multimodal large language models (MLLMs), covering visual prompt generation, the integration of visual prompts into MLLM perception and reasoning, and model alignment techniques.