To mitigate hallucination in large vision-language models, the authors propose Fine-Grained Artificial Intelligence Feedback (FGAIF), a method that aligns the text and image modalities using fine-grained AI feedback on three types of hallucination: object existence, object attribute, and object relationship.





Aligning Large Vision-Language Models with Fine-Grained AI Feedback to Mitigate Hallucinations