The paper proposes a novel architectural unit, the Pyramid Pixel Context Adaption (PPCA) module, to enhance the representational capability of convolutional neural networks (CNNs) for medical image classification.
The key highlights are:
PPCA exploits multi-scale pixel context information through a cross-channel pyramid pooling method. The authors present it as the first spatial attention design to aggregate and leverage multi-scale pixel context information, whereas existing spatial attention methods utilize only single-scale pixel context information.
PPCA introduces a pixel normalization operator to eliminate the inconsistency of multi-scale pixel context features at the same pixel positions, stabilizing their distribution for more effective pixel-level recalibration.
PPCA adaptively fuses the normalized multi-scale pixel context features to generate pixel-wise attention weights, enabling the network to dynamically focus on informative pixel positions and suppress trivial ones in a pixel-independent manner.
The PPCA module is combined with modern CNN architectures to construct PPCANet for medical image classification. Additionally, the authors introduce supervised contrastive learning, which exploits label information, to further boost performance.
Extensive experiments on six medical image datasets demonstrate the superiority of PPCANet over state-of-the-art attention-based networks and recent deep neural networks, especially in highlighting subtle lesion regions. Visual analysis and ablation studies are provided to explain the inherent behavior of PPCA in the decision-making process.
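The three PPCA steps described above (cross-channel pyramid pooling, pixel normalization, adaptive fusion into pixel-wise attention weights) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the pyramid bin sizes (1, 2, 4), the use of average pooling over channel groups, the zero-mean/unit-variance form of the normalization, and the linear fusion followed by a sigmoid are all assumptions made for illustration; the summary does not specify them.

```python
import numpy as np

def cross_channel_pyramid_pool(x, bins=(1, 2, 4)):
    """Pool a (C, H, W) feature map across channels at several scales.

    Assumption: for each bin count b, the channels are split into b groups
    and averaged, giving b context maps. Requires C >= max(bins).
    Returns an array of shape (sum(bins), H, W).
    """
    channel_ids = np.arange(x.shape[0])
    feats = []
    for b in bins:
        groups = np.array_split(channel_ids, b)
        feats.append(np.stack([x[g].mean(axis=0) for g in groups]))
    return np.concatenate(feats, axis=0)

def pixel_normalize(ctx, eps=1e-5):
    """Normalize the multi-scale context values at each pixel position.

    Assumption: zero mean and unit variance across the context axis,
    computed independently for every pixel.
    """
    mean = ctx.mean(axis=0, keepdims=True)
    var = ctx.var(axis=0, keepdims=True)
    return (ctx - mean) / np.sqrt(var + eps)

def ppca_attention(x, weights, bias=0.0, bins=(1, 2, 4)):
    """Fuse normalized context into one attention weight per pixel.

    Assumption: a weighted sum over the context axis followed by a sigmoid.
    `weights` has length sum(bins) and would be learned in a real network.
    """
    ctx = pixel_normalize(cross_channel_pyramid_pool(x, bins))
    logits = np.tensordot(weights, ctx, axes=(0, 0)) + bias  # shape (H, W)
    attn = 1.0 / (1.0 + np.exp(-logits))
    return x * attn[None, :, :], attn

# Usage with random data standing in for a CNN feature map.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5, 5))   # 8 channels, 5x5 spatial grid
w = rng.normal(size=7)           # one weight per context map (1 + 2 + 4)
out, attn = ppca_attention(x, w)
print(out.shape, attn.shape)
```

Each pixel receives its own weight in (0, 1), computed only from the context values at that position, which is one way to read the "pixel-independent" recalibration mentioned above.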
Key insights distilled from: Xiaoqing Zha... at arxiv.org, 05-03-2024
https://arxiv.org/pdf/2303.01917.pdf