The paper introduces ASSNet, a transformer-based architecture designed for accurate medical image segmentation. The key highlights are:
ASSNet combines the strengths of ResUNet and Swin Transformer, incorporating window attention, spatial attention, U-shaped architecture, and residual connections to enable efficient segmentation.
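To make the window-attention idea concrete, here is a minimal, dependency-light sketch of Swin-style attention computed independently within non-overlapping windows of a feature map. It is an illustrative single-head version with identity Q/K/V projections, not ASSNet's actual implementation (which uses learned projections and additional spatial attention):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(x, window=4):
    """Single-head self-attention computed independently inside each
    non-overlapping window, as in Swin-style transformers.
    x: feature map of shape (H, W, C); H and W divisible by `window`."""
    H, W, C = x.shape
    # Partition into (num_windows, window*window, C) token groups.
    x = x.reshape(H // window, window, W // window, window, C)
    x = x.transpose(0, 2, 1, 3, 4).reshape(-1, window * window, C)
    # Identity projections keep the sketch dependency-free; a real block
    # would apply learned Q/K/V weight matrices here.
    q, k, v = x, x, x
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(C))
    out = attn @ v
    # Reverse the window partition back to (H, W, C).
    out = out.reshape(H // window, W // window, window, window, C)
    return out.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

feat = np.random.rand(8, 8, 16)
print(window_attention(feat).shape)  # (8, 8, 16)
```

Restricting attention to windows keeps the cost linear in image size rather than quadratic, which is what makes transformer attention tractable on high-resolution medical images.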
The proposed Adaptive Feature Fusion (AFF) Decoder fuses feature maps of varying scales, exploiting window attention to capture both multi-scale local and global information. It comprises the Long Range Dependencies (LRD) block, the Multi-Scale Feature Fusion (MFF) block, and the Adaptive Semantic Center (ASC) block.
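The core fusion step can be sketched generically: upsample each coarser decoder map to the finest resolution and concatenate along channels. This is a simplified stand-in for the MFF block, not the paper's exact design (which adds attention-based weighting):

```python
import numpy as np

def upsample_nn(x, factor):
    """Nearest-neighbour upsampling of an (H, W, C) feature map."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def fuse_multiscale(features):
    """Fuse feature maps of varying scales by upsampling each to the
    finest resolution and concatenating along the channel axis.
    `features`: list of (H_i, W_i, C_i) maps, finest first, with each
    subsequent scale halving H and W."""
    target_h = features[0].shape[0]
    ups = [upsample_nn(f, target_h // f.shape[0]) for f in features]
    return np.concatenate(ups, axis=-1)

pyramid = [np.random.rand(16, 16, 8),
           np.random.rand(8, 8, 8),
           np.random.rand(4, 4, 8)]
print(fuse_multiscale(pyramid).shape)  # (16, 16, 24)
```

Fusing scales this way lets the decoder combine fine spatial detail from shallow maps with the broader semantic context of deep maps, which is why multi-scale fusion helps with small, irregular structures.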
Comprehensive experiments on liver tumor, bladder tumor, and multi-organ segmentation datasets demonstrate that ASSNet achieves new state-of-the-art results, outperforming previous methods by a significant margin. The model excels at segmenting small and irregularly shaped tumors, as well as miniature organs, which are challenging for other approaches.
The ablation study confirms the importance of each component in ASSNet, highlighting the crucial role of long-range dependency modeling, multi-scale feature fusion, and edge detection in achieving high-performance medical image segmentation.