Towards Helpful and Honest Remote Sensing Vision Language Model (H2RSVLM)
The authors constructed a large-scale high-quality and detailed caption dataset for remote sensing images (HqDC-1.4M) and the first Remote Sensing Self-Awareness (RSSA) dataset to enhance the helpfulness and honesty of remote sensing vision language models. Based on these datasets, they developed the Helpful and Honest Remote Sensing Vision Language Model (H2RSVLM) that achieves outstanding performance on multiple remote sensing tasks while being able to recognize and refuse to answer unanswerable questions.