RLHF-V addresses hallucination in multimodal large language models (MLLMs) by collecting human corrections of hallucinated outputs and optimizing the model to align its behavior with them. It outperforms existing models in trustworthiness and robustness while showing promising data efficiency.
RLHF-V introduces a framework for behavior alignment in MLLMs based on fine-grained correctional human feedback. The resulting model markedly reduces hallucination rates and achieves state-of-the-art trustworthiness among open-source MLLMs.
The framework collects segment-level corrections from human annotators, yielding clear, dense, and fine-grained feedback from which the model can learn efficient behavior boundaries (a sketch of how such corrections might enter a training objective is given below). RLHF-V is also more robust than GPT-4V against hallucinations induced by over-generalization.
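To make the idea of "dense, segment-level feedback" concrete, the following is a minimal sketch of one way human-corrected segments could be folded into a DPO-style preference loss. It is not the authors' released implementation; the function name `dense_dpo_loss`, the mask convention, and the weight `gamma` are illustrative assumptions.

```python
# Minimal sketch (not the RLHF-V release code) of a DPO-style loss that
# up-weights tokens lying inside human-corrected segments.
# Assumptions: per-token log-probs are available for the corrected ("chosen")
# and original hallucinated ("rejected") responses, under both the policy and
# a frozen reference model; `corrected_mask` marks the edited segments.

import torch
import torch.nn.functional as F

def dense_dpo_loss(
    policy_chosen_logps,    # (B, T) per-token log-probs of the corrected response
    policy_rejected_logps,  # (B, T) per-token log-probs of the hallucinated response
    ref_chosen_logps,       # (B, T) same quantities under the frozen reference model
    ref_rejected_logps,     # (B, T)
    corrected_mask,         # (B, T) 1.0 where a token lies in a human-corrected segment
    beta=0.1,               # preference-loss temperature
    gamma=5.0,              # extra weight on corrected segments (hypothetical value)
):
    # Up-weight the tokens that annotators actually changed, so the dense,
    # segment-level feedback dominates the sequence-level preference signal.
    weights = 1.0 + (gamma - 1.0) * corrected_mask

    chosen_logratio = (weights * (policy_chosen_logps - ref_chosen_logps)).sum(-1)
    rejected_logratio = (policy_rejected_logps - ref_rejected_logps).sum(-1)

    # Standard Bradley-Terry / DPO objective on the (weighted) log-ratios.
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()
```

Concentrating the weight on corrected spans is one way to exploit the density of segment-level corrections, in contrast to a single response-level preference label that leaves the model to guess which part of the output was wrong.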
Comprehensive experiments demonstrate that RLHF-V substantially improves the trustworthiness of MLLMs with promising data and computation efficiency: using a comparatively small set of annotated samples, it significantly reduces object hallucination rates, surpassing concurrent models trained on more preference data.