RGPT enhances region-level captioning and understanding by refining visual features and integrating task-guided instruction prompts.
RegionGPT enhances region-level captioning and understanding by refining visual features and integrating task-guided instruction prompts.