Core Concept
This paper introduces SG-BEV, an approach that combines satellite imagery with street-view data for cross-view semantic segmentation of fine-grained building attributes. It targets two limitations of existing methods: they fail to capture complete building facade features, and they distribute features unevenly.
Summary
The paper aims to achieve fine-grained building attribute segmentation in a cross-view scenario, using satellite and street-view image pairs. The main challenge lies in overcoming the significant perspective differences between street views and satellite views.
The key highlights and insights are:
The authors apply the Bird's Eye View (BEV) paradigm to cross-view semantic segmentation, establishing an explicit spatial mapping from the street-view perspective to the satellite perspective.
They develop a Satellite-Guided Reprojection (SGR) module to address the issue of features being unevenly concentrated at the edges of buildings in traditional BEV methods.
The proposed SG-BEV method shows significant improvements on four cross-view datasets, raising mIoU by 10.13% and 5.21% over the state-of-the-art satellite-based and cross-view methods, respectively.
The authors show that their approach is more effective in predicting fine-grained building attributes, such as floor counts, compared to using satellite imagery alone, highlighting the efficacy of integrating street-view data.
Comprehensive experiments are conducted on four cross-view datasets from cities including New York, San Francisco, and Boston, demonstrating the robustness and generalization capabilities of the proposed method.
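To make the street-view-to-satellite mapping more concrete, here is a minimal sketch of a generic BEV projection. It is not the authors' implementation. The function name `project_to_bev`, the input format (azimuth, ground distance, feature vector), and the grid parameters are all assumptions chosen for illustration. Each street-view sample is placed in the top-down cell where its viewing ray ends, and features that share a cell are averaged.

```python
import math


def project_to_bev(points, grid_size=64, cell_m=1.0):
    """Scatter street-view features onto a top-down grid centred on the camera.

    Hypothetical input format: each point is (azimuth_rad, distance_m, feature),
    where azimuth is measured clockwise from north and distance is the
    horizontal distance from the camera to the observed surface.
    Returns {(row, col): mean feature vector} for every occupied cell.
    """
    half = grid_size / 2
    sums = {}
    counts = {}
    for azimuth, distance, feature in points:
        # Polar (azimuth, distance) to metric offsets from the camera.
        east = distance * math.sin(azimuth)
        north = distance * math.cos(azimuth)
        # Row 0 is the northern edge of the grid, so north decreases the row index.
        col = math.floor(east / cell_m + half)
        row = math.floor(half - north / cell_m)
        if not (0 <= row < grid_size and 0 <= col < grid_size):
            continue  # sample falls outside the BEV window
        key = (row, col)
        if key not in sums:
            sums[key] = [0.0] * len(feature)
            counts[key] = 0
        sums[key] = [s + f for s, f in zip(sums[key], feature)]
        counts[key] += 1
    # Average the features that landed in the same cell.
    return {key: [s / counts[key] for s in sums[key]] for key in sums}


bev = project_to_bev([
    (0.0, 10.0, [1.0, 2.0]),   # facade sample 10 m north of the camera
    (0.0, 9.5, [3.0, 4.0]),    # second sample in the same 1 m cell
    (0.0, 100.0, [9.0, 9.0]),  # too far away, dropped
])
```

In this simplified scheme every sample lands on the cell where the ray meets the facade, so features collect along building edges rather than across the footprint. That is one plausible reading of the uneven distribution the SGR module is described as addressing.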
Statistics
The paper reports the following key metrics:
Averaged across the four datasets, SG-BEV improves mIoU by 10.13% over the state-of-the-art satellite-based method and by 5.21% over the state-of-the-art cross-view method.
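For readers unfamiliar with the metric, mIoU (mean intersection over union) averages, over classes, the ratio of correctly labelled pixels to the union of predicted and ground-truth pixels for that class. The sketch below shows the standard definition on flat label lists. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions, and the paper's exact evaluation protocol may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target.

    pred and target are equal-length sequences of integer class labels,
    for example a flattened segmentation map.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes that occur in neither map
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Class 0: IoU = 1/2, class 1: IoU = 2/3, so the mean is 7/12.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```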