The article examines the fundamental assumptions behind hierarchical label structures in semantic segmentation. It starts from the observation that the empirical benefits reported in prior work may not stem from the semantic definition of the label hierarchy. Through a series of cross-domain experiments, the authors show that a flat segmentation network, in which parent categories are inferred from their children, consistently outperforms the more sophisticated hierarchical approaches, especially on the parent nodes.
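To make "parent categories are inferred from their children" concrete, here is a minimal sketch for a single pixel. It assumes that a parent's probability is the sum of its children's softmax probabilities; the summary does not state the exact inference rule, so treat that as an assumption. The two-level hierarchy and the names `HIERARCHY`, `parent_probs`, and `predict_parent` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical two-level hierarchy: parent name -> indices of its child classes.
# (Illustrative only; not taken from the paper.)
HIERARCHY = {"vehicle": [0, 1], "human": [2, 3]}


def softmax(logits):
    """Numerically stable softmax over a list of child-class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def parent_probs(child_logits, hierarchy=HIERARCHY):
    """Assumed rule: a parent's probability is the sum of its children's probabilities."""
    probs = softmax(child_logits)
    return {
        parent: sum(probs[i] for i in children)
        for parent, children in hierarchy.items()
    }


def predict_parent(child_logits, hierarchy=HIERARCHY):
    """Return the parent whose summed child probability is largest."""
    scores = parent_probs(child_logits, hierarchy)
    return max(scores, key=scores.get)
```

Under this rule the parent prediction need not agree with the parent of the top-scoring child: for the logits `[2.0, 0.0, 1.8, 1.8]` the most likely child is index 0 (a "vehicle" class), yet the two "human" children together carry more probability mass, so `predict_parent` returns `"human"`.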
The authors identify an inherent bias of flat classifiers in Euclidean space, where the distance between decision boundaries and class embeddings is non-uniform. This "parent bias" can lead to suboptimal accuracy on parent-level classification. To mitigate the issue, the authors propose modeling the pixel features in the hyperbolic Poincaré ball, which exhibits more uniform properties between class representations.
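For readers unfamiliar with the Poincaré ball, the sketch below shows two standard operations of that model: the exponential map at the origin, which sends a Euclidean feature vector into the ball, and the geodesic distance in the unit ball (curvature -1). These are textbook formulas, not the authors' implementation; the summary does not say how the paper maps features or builds its classifier, and the function names `exp_map_origin` and `poincare_distance` are hypothetical.

```python
import math


def _sq_norm(v):
    """Squared Euclidean norm of a vector given as a list of floats."""
    return sum(t * t for t in v)


def exp_map_origin(v, c=1.0):
    """Exponential map at the origin of the Poincaré ball with curvature -c.

    Sends a Euclidean (tangent) vector v to a point with norm < 1/sqrt(c).
    """
    norm = math.sqrt(_sq_norm(v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * t for t in v]


def poincare_distance(x, y):
    """Geodesic distance between two points of the unit Poincaré ball (c = 1)."""
    diff = [a - b for a, b in zip(x, y)]
    arg = 1.0 + 2.0 * _sq_norm(diff) / ((1.0 - _sq_norm(x)) * (1.0 - _sq_norm(y)))
    return math.acosh(arg)
```

With `c = 1`, a tangent vector of Euclidean norm `r` lands at hyperbolic distance `2r` from the origin, while its Euclidean norm inside the ball is `tanh(r)` and therefore always stays below 1. Distances grow rapidly near the boundary, which is the property hyperbolic embeddings rely on.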
The experimental results show that the hyperbolic flat classifier outperforms both its Euclidean counterpart and the hierarchical approach in segmentation accuracy and calibration, especially on the more challenging test domains. The authors conclude that the Poincaré ball model generalizes surprisingly well to unseen test domains, suggesting that the benefits of the established practice of hierarchical segmentation may be limited to in-domain settings.
Source: arxiv.org