This paper proposes a geometrically consistent cost aggregation scheme that uses local geometric smoothness and surface normals to exploit adjacent geometry, improving multi-view stereo reconstruction performance.
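To make the role of surface normals concrete, here is a minimal sketch of one standard way a normal can relate the depths of adjacent pixels: assume the neighbor lies on the local tangent plane and intersect its viewing ray with that plane. This is an illustration under stated assumptions, not the paper's aggregation scheme; the function name `propagate_depth`, the pinhole intrinsics `K`, and the planar assumption are all introduced here for illustration.

```python
import numpy as np

def propagate_depth(depth_p, normal, pixel_p, pixel_q, K):
    """Depth at pixel_q, assuming it lies on the plane through pixel_p's
    3D point with the given normal (camera coordinates, pinhole model).

    Hypothetical helper for illustration; not taken from the paper.
    """
    K_inv = np.linalg.inv(K)
    # Back-project both pixels to viewing rays with unit z-component.
    ray_p = K_inv @ np.array([pixel_p[0], pixel_p[1], 1.0])
    ray_q = K_inv @ np.array([pixel_q[0], pixel_q[1], 1.0])
    normal = np.asarray(normal, dtype=float)
    denom = normal @ ray_q
    if abs(denom) < 1e-8:
        return None  # the neighbor's ray is (nearly) parallel to the plane
    # Plane constraint: n . (d_p * ray_p) = n . (d_q * ray_q)
    return depth_p * (normal @ ray_p) / denom

# Assumed example intrinsics: focal length 100 px, principal point (50, 50).
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
flat = propagate_depth(2.0, [0.0, 0.0, 1.0], (50, 50), (60, 50), K)
slanted = propagate_depth(2.0, [-0.5, 0.0, 1.0], (50, 50), (60, 50), K)
```

For a fronto-parallel normal the neighbor inherits the same depth, while a slanted normal shifts the propagated depth along the plane, which is why normals help neighboring pixels share information more faithfully than a constant-depth assumption.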
CHOSEN is a simple yet flexible, robust, and effective multi-view depth refinement framework that iteratively re-samples and selects the best depth hypotheses using contrastive learning, and automatically adapts to the different metric or intrinsic scales determined by the capture system.
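The resample-and-select loop described above can be sketched roughly as follows. This is a toy illustration only: the name `refine_depth`, the uniform additive perturbations, the shrinking search radius, and the hand-written `score_fn` are assumptions made here, and `score_fn` merely stands in for the learned, contrastively trained hypothesis scoring.

```python
import numpy as np

def refine_depth(init_depth, score_fn, num_iters=4, num_hyps=8,
                 init_radius=1.0, shrink=0.5, seed=0):
    """Iteratively re-sample depth hypotheses and keep the best-scoring one.

    Hypothetical sketch: score_fn maps hypotheses of shape
    (num_hyps, ...) to scores of the same shape, higher is better.
    """
    rng = np.random.default_rng(seed)
    depth = np.asarray(init_depth, dtype=float).copy()
    radius = init_radius
    for _ in range(num_iters):
        # Sample perturbations around the current estimate.
        offsets = rng.uniform(-radius, radius, size=(num_hyps,) + depth.shape)
        offsets[0] = 0.0  # always keep the current estimate as a candidate
        hyps = depth[None] + offsets
        scores = score_fn(hyps)
        # Per-pixel selection of the highest-scoring hypothesis.
        best = np.argmax(scores, axis=0)
        depth = np.take_along_axis(hyps, best[None], axis=0)[0]
        radius *= shrink  # narrow the search as the estimate improves
    return depth

# Toy usage: the score is the negative distance to a known target depth.
target = np.array([2.0, 3.0])
init = np.array([2.5, 2.4])
refined = refine_depth(init, lambda h: -np.abs(h - target))
```

Because the current estimate is always retained as a candidate, the selected score never decreases from one iteration to the next. The absolute search radius used here is a simplification and does not reproduce the scale adaptation that the framework is described as providing.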