Core Concepts
HALC is a novel decoding algorithm designed to reduce object hallucinations in large vision-language models by integrating adaptive focal-contrast grounding and specialized beam search.
Abstract
Large vision-language models (LVLMs) suffer from object hallucinations (OH).
HALC corrects hallucinated tokens during decoding, using a focal-contrast grounding mechanism at the token level and a specialized beam search at the sequence level.
HALC outperforms existing methods in reducing OH across benchmarks.
HALC can be easily integrated into LVLMs without extra training.
Experimental studies demonstrate HALC's effectiveness in reducing OH.
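The focal-contrast idea above can be sketched as a simple contrastive-decoding step: token logits conditioned on a focal (zoomed-in) visual context are contrasted against logits conditioned on the full image, boosting tokens the focal view supports. This is a minimal illustrative sketch, not HALC's exact formulation; the function name, the `alpha` weight, and the combination rule are assumptions introduced here for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a logit vector.
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def focal_contrast_distribution(logits_focal, logits_full, alpha=1.0):
    """Illustrative focal-contrast step (hypothetical formulation).

    Amplifies tokens whose logits rise under the focal visual context
    relative to the full-image context, suppressing tokens the focal
    view does not support.
    """
    contrast = (1 + alpha) * logits_focal - alpha * logits_full
    return softmax(contrast)

# Toy example: token 0 is strongly supported by the focal view but not
# by the full image, so the contrast sharpens the distribution toward it.
logits_focal = np.array([2.0, 0.0])
logits_full = np.array([0.0, 0.0])
p = focal_contrast_distribution(logits_focal, logits_full, alpha=1.0)
```

In a full decoder this contrasted distribution would replace the ordinary next-token distribution only at candidate hallucinated tokens, with beam search then selecting among corrected continuations.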
Stats
HALC is a novel decoding algorithm for reducing object hallucinations in large vision-language models.
HALC integrates an adaptive focal-contrast grounding mechanism and a specialized beam search to correct object hallucinations.
HALC outperforms existing methods across a variety of benchmarks.
Quotes
"HALC leverages distinct fine-grained optimal visual information in vision-language tasks."
"HALC can be integrated into any LVLMs as a plug-and-play module without extra training."