RLHF-V addresses hallucinations in multimodal large language models (MLLMs) by collecting human corrections of hallucinated content and using them to align model behavior. It is reported to outperform existing models in trustworthiness and robustness, with promising data efficiency.
RLHF-V introduces a novel framework for behavior alignment in MLLMs using fine-grained correctional human feedback. The model significantly reduces hallucination rates and achieves state-of-the-art performance in trustworthiness among open-source MLLMs.
The framework collects segment-level corrections from human annotators to provide clear, dense, and fine-grained feedback for learning efficient behavior boundaries. RLHF-V shows better robustness than GPT-4V in preventing over-generalization-induced hallucinations.
Comprehensive experiments demonstrate that RLHF-V can substantially enhance the trustworthiness of MLLMs with promising data and computation efficiency. Using annotated data samples, RLHF-V significantly reduces object hallucination rates, surpassing concurrent models trained on more preference data.
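To make the idea of segment-level correction concrete, the following is a minimal sketch of how one annotated response could be turned into a preference pair. The `Correction` schema, the `build_preference_pair` function, and the character-offset format are assumptions made for illustration; they are not the data format or code released with RLHF-V.

```python
from dataclasses import dataclass


@dataclass
class Correction:
    """One annotator edit (hypothetical schema): replace response[start:end] with `fixed`."""
    start: int
    end: int
    fixed: str


def build_preference_pair(response: str, corrections: list[Correction]):
    """Return (rejected, chosen, changed_spans).

    rejected:      the original model response, unchanged
    chosen:        the response with every correction applied
    changed_spans: (start, end) character offsets of corrected text inside `chosen`
    """
    chosen_parts = []
    changed_spans = []
    cursor = 0   # read position in the original response
    out_len = 0  # length of the corrected text built so far

    for c in sorted(corrections, key=lambda c: c.start):
        if c.start < cursor or c.end < c.start or c.end > len(response):
            raise ValueError("corrections must be in range and non-overlapping")
        # Copy the untouched text that precedes this correction.
        unchanged = response[cursor:c.start]
        chosen_parts.append(unchanged)
        out_len += len(unchanged)
        # Substitute the annotator's fix and record where it lands.
        chosen_parts.append(c.fixed)
        changed_spans.append((out_len, out_len + len(c.fixed)))
        out_len += len(c.fixed)
        cursor = c.end

    # Copy whatever follows the last correction.
    chosen_parts.append(response[cursor:])
    return response, "".join(chosen_parts), changed_spans
```

Because the rejected and chosen responses differ only in the corrected segments, the returned spans mark exactly where the human feedback applies. A training objective could, as one possible design, use those spans to emphasize the corrected segments over the unchanged text.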
Key insights extracted from
by Tianyu Yu, Yu... at arxiv.org 03-11-2024
https://arxiv.org/pdf/2312.00849.pdf