Key Concepts
The paper introduces the Dynamic Backward Attention Transformer (DBAT) for material segmentation, which dynamically aggregates features from patches at multiple resolutions to improve performance.
Summary
The paper presents the DBAT model for material segmentation, achieving high accuracy and interpretability. It addresses challenges in combining material and contextual features, showcasing superior performance compared to state-of-the-art models.
Recent studies adopt image patches to extract material features. The DBAT model aggregates cross-resolution features dynamically by merging adjacent patches at each transformer stage. Experiments show that the DBAT achieves an accuracy of 86.85%, outperforming other real-time models.
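The idea of merging adjacent patches between stages can be illustrated with a minimal sketch. It assumes a Swin-style merge in which each 2x2 block of neighbouring patch features is concatenated along the channel axis; the summary does not specify the exact operation DBAT uses, and the function name `merge_adjacent_patches` is hypothetical. The learned linear projection that usually follows such a merge is omitted.

```python
import numpy as np


def merge_adjacent_patches(feat: np.ndarray) -> np.ndarray:
    """Merge each 2x2 block of neighbouring patches into one coarser patch.

    feat has shape (H, W, C): an H x W grid of patch features with C channels.
    The result has shape (H/2, W/2, 4C). This is an assumed Swin-style merge,
    not necessarily the operation used in DBAT.
    """
    h, w, c = feat.shape
    if h % 2 or w % 2:
        raise ValueError("patch grid height and width must be even")
    # Split both grid axes into (coarse index, offset inside the 2x2 block).
    blocks = feat.reshape(h // 2, 2, w // 2, 2, c)
    # Bring the two offsets next to the channel axis: (H/2, W/2, 2, 2, C).
    blocks = blocks.transpose(0, 2, 1, 3, 4)
    # Concatenate the four neighbours along the channel axis.
    return blocks.reshape(h // 2, w // 2, 4 * c)


# Each merge halves the grid resolution, so every stage sees larger patches.
stage1 = np.random.rand(8, 8, 16)
stage2 = merge_adjacent_patches(stage1)   # shape (4, 4, 64)
stage3 = merge_adjacent_patches(stage2)   # shape (2, 2, 256)
```

Stacking such merges is what produces features at several patch resolutions, which a model can then aggregate.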
The DBAT model is more robust to network initialization and yields less variable predictions than other models. The attention-based residual connection guides the aggregation of cross-resolution features effectively.
The study evaluates the DBAT on two datasets, LMD and OpenSurfaces, reporting superior performance in material segmentation tasks. The network dissection method reveals that the DBAT excels in extracting material-related features like texture.
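Segmentation results of this kind are reported as pixel accuracy (Pixel Acc). A minimal sketch of that metric follows, assuming it is the fraction of labelled pixels whose predicted class matches the ground truth. The function name `pixel_accuracy` and the optional `ignore_label` argument are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def pixel_accuracy(pred, target, ignore_label=None) -> float:
    """Fraction of valid pixels where the predicted class equals the target.

    pred and target are integer class maps of the same shape.
    ignore_label (an assumption for datasets with unlabelled pixels) marks
    target pixels that are excluded from the count.
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    if ignore_label is None:
        valid = np.ones(target.shape, dtype=bool)
    else:
        valid = target != ignore_label
    n_valid = int(valid.sum())
    if n_valid == 0:
        raise ValueError("no valid pixels to evaluate")
    return float((pred[valid] == target[valid]).sum() / n_valid)


pred = [[0, 1], [2, 2]]
target = [[0, 1], [1, 2]]
acc = pixel_accuracy(pred, target)  # 3 of 4 pixels match
```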
Statistics
Experiments show that the DBAT achieves an accuracy of 86.85%.
The Pixel Acc represents a 21.21% increase compared to previous publications.
The average pixel accuracy (Pixel Acc) reaches 86.85% when assessed on the LMD.
The CKA similarity score between Swin and DBAT is 0.9583.
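For readers unfamiliar with CKA (centered kernel alignment), the sketch below computes the linear variant between two sets of activations. The summary does not say which CKA variant or which layers produced the reported score, so the linear form and the function name `linear_cka` are assumptions made for illustration.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two activation matrices.

    x has shape (n_samples, d1) and y has shape (n_samples, d2); rows must
    correspond to the same inputs. The score lies in [0, 1], and 1 means the
    representations match up to rotation and isotropic scaling.
    """
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must have the same number of samples")
    # Center every feature so the score ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


rng = np.random.default_rng(0)
acts_a = rng.normal(size=(50, 8))
# A rotated and rescaled copy of the same representation scores close to 1.
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
acts_b = 3.0 * acts_a @ q
score = linear_cka(acts_a, acts_b)
```

A score near 1, such as the one reported for Swin and DBAT, indicates that the two networks' compared activations are highly similar under this measure.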