RLHF-V addresses hallucinations in multimodal large language models (MLLMs) by collecting human corrections of hallucinated content and using them to optimize behavior alignment. It outperforms existing open-source models in trustworthiness and robustness while showing promising data efficiency.
RLHF-V introduces a novel framework for behavior alignment in MLLMs using fine-grained correctional human feedback. The model significantly reduces hallucination rates and achieves state-of-the-art performance in trustworthiness among open-source MLLMs.
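To make the idea of fine-grained correctional feedback concrete, a single preference record pairs a hallucinated response with its human-corrected version and marks which segments were changed. The sketch below is purely illustrative; the field names and span format are assumptions, not the paper's released data schema.

```python
# Illustrative segment-level correctional feedback record (assumed schema,
# not the paper's released data format).
correction_record = {
    "image": "example.jpg",
    "prompt": "Describe the image.",
    "original_response": "A man in a red shirt holds an umbrella next to a dog.",
    "corrected_response": "A man in a red shirt holds an umbrella.",
    # Character span of the hallucinated segment the annotator removed:
    # indices 38..53 cover " next to a dog."
    "corrected_spans": [(38, 53)],
}
```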
The framework collects segment-level corrections from human annotators, providing clear, dense, and fine-grained feedback for efficiently learning behavior boundaries. RLHF-V is also more robust than GPT-4V at preventing hallucinations induced by over-generalization.
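One way such segment-level corrections can yield a dense training signal is a DPO-style preference objective in which tokens inside corrected segments are up-weighted, so the model is penalized specifically for the hallucinated spans rather than uniformly over the whole response. The sketch below is a minimal, assumed PyTorch implementation of that idea; the function names, the weighting scheme, and hyperparameters such as `beta` and `gamma` are illustrative, not the paper's exact training code.

```python
import torch
import torch.nn.functional as F

def weighted_logprob(token_logps, corrected_mask, gamma=5.0):
    """Sum per-token log-probs, up-weighting tokens inside corrected segments.

    token_logps    -- (batch, seq_len) log-probabilities of response tokens
    corrected_mask -- (batch, seq_len) 1.0 where a human correction applies, else 0.0
    gamma          -- extra weight on corrected tokens (illustrative value)
    """
    weights = 1.0 + (gamma - 1.0) * corrected_mask
    return (weights * token_logps).sum(dim=-1)

def dense_dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected,
                   chosen_mask, rejected_mask, beta=0.1, gamma=5.0):
    """DPO-style preference loss on segment-weighted sequence log-probs.

    *_chosen / *_rejected are per-token log-probs of the human-corrected
    (chosen) and original hallucinated (rejected) responses under the policy
    and a frozen reference model.
    """
    pi_logratio = (weighted_logprob(policy_chosen, chosen_mask, gamma)
                   - weighted_logprob(policy_rejected, rejected_mask, gamma))
    ref_logratio = (weighted_logprob(ref_chosen, chosen_mask, gamma)
                    - weighted_logprob(ref_rejected, rejected_mask, gamma))
    # Standard Bradley-Terry style DPO objective on the weighted log-ratios.
    return -F.logsigmoid(beta * (pi_logratio - ref_logratio)).mean()

if __name__ == "__main__":
    # Toy usage with random tensors standing in for model outputs.
    B, T = 2, 8
    fake_logps = lambda: -torch.rand(B, T)   # stand-in per-token log-probs
    mask = torch.zeros(B, T)
    mask[:, 3:6] = 1.0                       # tokens inside corrected segments
    loss = dense_dpo_loss(fake_logps(), fake_logps(), fake_logps(), fake_logps(),
                          mask, mask)
    print(float(loss))
```

Up-weighting only the corrected tokens keeps the preference signal focused on the hallucinated content, which is what makes the feedback dense and data-efficient compared with response-level preferences.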
Comprehensive experiments demonstrate that RLHF-V substantially enhances the trustworthiness of MLLMs with promising data and computation efficiency. Using only 1.4k annotated data samples, RLHF-V significantly reduces the object hallucination rate of the base MLLM, surpassing the concurrent LLaVA-RLHF trained on 10k preference data.
Key insights distilled from the paper by Tianyu Yu, Yu... at arxiv.org, 03-11-2024. Source: https://arxiv.org/pdf/2312.00849.pdf