Teaching Multimodal Large Language Models with Faithful, Concise, and Transferable Rationales
Fact is a novel paradigm that generates faithful, concise, and transferable multimodal rationales for teaching Multimodal Large Language Models (MLLMs), enhancing their compositional reasoning and generalization abilities.