Enhancing the Performance of Diverse Multimodal Large Language Models through Transferable Visual Prompting
Transferable Visual Prompting (TVP) can effectively improve the performance of diverse Multimodal Large Language Models (MLLMs) on a wide range of tasks by optimizing a set of shared visual prompts.
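
To make the idea of a shared visual prompt concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the TVP method itself. The prompt is assumed to be a bounded pixel perturbation on the image border. The "models" are toy linear scorers standing in for MLLMs. The function names (`border_mask`, `apply_visual_prompt`, `mean_score`, `optimize_shared_prompt`) and the hyperparameters (`border`, `lr`, `eps`, `steps`) are hypothetical.

```python
import numpy as np


def border_mask(shape, border):
    """Return a mask that is 1 on the image border and 0 in the center."""
    mask = np.ones(shape)
    mask[border:-border, border:-border, :] = 0.0
    return mask


def apply_visual_prompt(image, prompt, border=4):
    """Add the prompt to the border pixels only and keep values in [0, 1]."""
    return np.clip(image + border_mask(image.shape, border) * prompt, 0.0, 1.0)


def mean_score(image, models):
    """Average score over all toy models (assumption: each model is a linear scorer)."""
    return float(np.mean([np.sum(w * image) for w in models]))


def optimize_shared_prompt(image, models, border=4, lr=0.01, eps=0.1, steps=50):
    """Optimize one prompt shared by every model with projected gradient descent.

    Assumption: the loss is -mean_score, so for linear scorers its gradient
    with respect to the prompt is constant (image clipping is ignored).
    """
    mask = border_mask(image.shape, border)
    prompt = np.zeros_like(image)
    grad = -mask * np.mean(models, axis=0)
    for _ in range(steps):
        # Gradient step, then projection back into the [-eps, eps] budget.
        prompt = np.clip(prompt - lr * grad, -eps, eps)
    return prompt
```

A possible usage, again with made-up data: draw a random image and two random weight arrays of shape `(16, 16, 3)`, call `optimize_shared_prompt(image, models)`, and compare `mean_score` on the original image with `mean_score` on `apply_visual_prompt(image, prompt)`. The same prompt array is applied to the input of every model, which is the sense in which it is shared in this sketch.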