The paper proposes a novel approach for learning-based multi-view stereo (MVS) reconstruction. It first analyzes the properties of existing loss functions, including regression-based loss and classification-based loss, and identifies their limitations.
To address these issues, the paper introduces two key components:
Adaptive Wasserstein Loss: The authors propose a loss function, the adaptive Wasserstein loss, that can measure the divergence between the true and predicted depth distributions even when they share no common support. By contrast, the Kullback-Leibler divergence used in classification-based methods becomes infinite or undefined when the distributions do not overlap, so it gives no useful signal about how far apart they are.
Offset Module: The authors also propose a simple but effective offset module that is added to the end of the network. This module predicts an offset value for each discrete depth value, allowing the network to output continuous depth values by combining the mode of the discrete depth probabilities and the predicted offsets. This helps to achieve sub-pixel depth accuracy.
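The two components can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the plain 1D Wasserstein-1 distance (the "adaptive" part is not described in this summary), it assumes the offset module adds the predicted offset of the most probable depth bin to that bin's depth value, and all function names and numbers are hypothetical.

```python
import math


def wasserstein_1d(p, q, bin_width=1.0):
    """Wasserstein-1 distance between two histograms on the same ordered bins.

    In 1D this equals the summed absolute difference of the two CDFs.
    Assumption: plain W1, not the paper's adaptive variant.
    """
    cdf_gap = 0.0
    total = 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        total += abs(cdf_gap) * bin_width
    return total


def kl_divergence(p, q):
    """KL(p || q); infinite when p has mass in a bin where q has none."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def continuous_depth(probs, depth_values, offsets):
    """Depth of the most probable bin plus that bin's predicted offset.

    Assumption: one offset per discrete depth value, applied at the mode.
    """
    mode = max(range(len(probs)), key=probs.__getitem__)
    return depth_values[mode] + offsets[mode]


# One-hot ground truth and two predictions with no overlapping support.
true_dist = [0.0, 1.0, 0.0, 0.0, 0.0]
near_miss = [0.0, 0.0, 1.0, 0.0, 0.0]
far_miss = [0.0, 0.0, 0.0, 0.0, 1.0]

print(kl_divergence(true_dist, near_miss))   # inf
print(kl_divergence(true_dist, far_miss))    # inf
print(wasserstein_1d(true_dist, near_miss))  # 1.0
print(wasserstein_1d(true_dist, far_miss))   # 3.0

# Hypothetical depth hypotheses (in mm) and per-bin offsets.
print(continuous_depth([0.1, 0.6, 0.2, 0.1],
                       [400.0, 410.0, 420.0, 430.0],
                       [0.0, 2.5, -1.0, 0.0]))  # 412.5
```

In this toy case the KL divergence is infinite for both predictions, while the Wasserstein distance still ranks the near miss (1 bin away) ahead of the far miss (3 bins away). The last call shows how a per-bin offset turns a discrete depth hypothesis into a continuous depth value.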
The proposed method with the adaptive Wasserstein loss and offset module is evaluated on several benchmark datasets, including DTU, Tanks and Temples, and BlendedMVS. The results demonstrate that the method achieves state-of-the-art performance, outperforming many existing learning-based MVS approaches.
Source: arxiv.org