RegionGPT: Enhancing Region-Level Understanding with Vision Language Model
RegionGPT introduces a novel framework, RGPT, to enhance region-level captioning and understanding by refining visual features and integrating task-guided instruction prompts.