The study proposes a simplified and adaptable approach to improving depth estimation accuracy using transfer learning and an optimized loss function. The loss is a weighted combination of three terms, Mean Absolute Error (MAE), Edge Loss, and Structural Similarity Index (SSIM), chosen to enhance robustness and generalization.
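To make the weighted combination concrete, the following is a minimal NumPy sketch, not the authors' implementation. The weights `w_mae`, `w_edge`, and `w_ssim` are hypothetical placeholders, since the summary does not state the values used. The SSIM term is computed once over the whole map rather than over sliding windows, and it is turned into a loss as (1 - SSIM) / 2; both are simplifying assumptions.

```python
import numpy as np


def mae_loss(y_true, y_pred):
    """Mean absolute per-pixel depth error."""
    return float(np.mean(np.abs(y_true - y_pred)))


def edge_loss(y_true, y_pred):
    """Mean absolute difference between the depth gradients of both maps."""
    # For a 2-D array, np.gradient returns the derivatives along rows, then columns.
    gy_t, gx_t = np.gradient(y_true)
    gy_p, gx_p = np.gradient(y_pred)
    return float(np.mean(np.abs(gy_t - gy_p) + np.abs(gx_t - gx_p)))


def ssim_loss(y_true, y_pred, max_val=1.0):
    """(1 - SSIM) / 2 with SSIM computed globally (assumption: no sliding window)."""
    c1 = (0.01 * max_val) ** 2
    c2 = (0.03 * max_val) ** 2
    mu_t, mu_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mu_t) * (y_pred - mu_p))
    ssim = ((2 * mu_t * mu_p + c1) * (2 * cov + c2)) / (
        (mu_t ** 2 + mu_p ** 2 + c1) * (var_t + var_p + c2)
    )
    return float((1.0 - ssim) / 2.0)


def combined_loss(y_true, y_pred, w_mae=1.0, w_edge=1.0, w_ssim=1.0):
    """Weighted sum of the three terms; the default weights are placeholders."""
    return (
        w_mae * mae_loss(y_true, y_pred)
        + w_edge * edge_loss(y_true, y_pred)
        + w_ssim * ssim_loss(y_true, y_pred)
    )
```

In this sketch both inputs are 2-D depth maps scaled to `[0, max_val]`. A perfect prediction yields a loss of zero, and each weight controls how strongly pixel accuracy, edge sharpness, or structural similarity is penalized.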
The authors explore multiple encoder-decoder-based models, including DenseNet121, DenseNet169, DenseNet201, and EfficientNet, for the supervised depth estimation task on the NYU Depth Dataset v2. They observe that the EfficientNet model, pre-trained on ImageNet for classification and used as an encoder with a simple upsampling decoder, gives the best results in terms of RMSE, REL, and log10.
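The three error metrics can be sketched as follows, using their usual definitions from the depth estimation literature; the summary itself does not spell them out. The function name `depth_metrics` and the masking of non-positive depth values are assumptions made for this illustration. Lower is better for all three.

```python
import numpy as np


def depth_metrics(gt, pred):
    """Return RMSE, REL, and log10 error between ground-truth and predicted depth."""
    gt = np.asarray(gt, dtype=np.float64)
    pred = np.asarray(pred, dtype=np.float64)

    # Assumption: pixels with non-positive depth are invalid and are skipped.
    valid = (gt > 0) & (pred > 0)
    gt, pred = gt[valid], pred[valid]

    rmse = np.sqrt(np.mean((gt - pred) ** 2))              # root mean squared error
    rel = np.mean(np.abs(gt - pred) / gt)                  # mean absolute relative error
    log10 = np.mean(np.abs(np.log10(gt) - np.log10(pred))) # mean absolute log10 error
    return {"rmse": float(rmse), "rel": float(rel), "log10": float(log10)}
```

RMSE is expressed in the depth unit of the data, while REL and log10 are scale-relative, which is why the three are typically reported together.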
The authors also perform a qualitative analysis, which illustrates that their model produces depth maps that closely resemble the ground truth, even in cases where the ground truth itself is flawed. The results indicate significant improvements in accuracy and robustness, with EfficientNet being the most successful architecture.
by Muhammad Ade... at arxiv.org, 04-12-2024
https://arxiv.org/pdf/2404.07686.pdf