The paper introduces the SMART pruning algorithm, a novel approach for efficient block and output channel pruning on computer vision tasks. The key highlights are:
The SMART pruner uses a separate, learnable probability mask to rank weight importance instead of relying on weight magnitude alone, enabling more precise importance ranking across layers.
The algorithm employs a differentiable Top-k operator to iteratively adjust and redistribute the mask parameters, allowing the soft probability mask to gradually converge to a binary mask.
To avoid convergence to non-sparse local minima, the SMART pruner utilizes a dynamic temperature parameter trick, where the temperature is gradually reduced during training to sharpen the differentiable Top-k function.
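The relaxation-plus-annealing mechanism described above can be sketched as follows. The `soft_topk_mask` function and its sigmoid-threshold construction are illustrative assumptions for exposition, not the paper's exact operator:

```python
import numpy as np

def soft_topk_mask(scores, k, temperature):
    # Illustrative sigmoid relaxation of a Top-k indicator (an assumption,
    # not the paper's exact formulation): place a threshold halfway between
    # the k-th and (k+1)-th largest scores. As temperature -> 0 the sigmoid
    # sharpens and the soft mask converges to a binary Top-k selection.
    sorted_scores = np.sort(scores)[::-1]
    threshold = 0.5 * (sorted_scores[k - 1] + sorted_scores[k])
    return 1.0 / (1.0 + np.exp(-(scores - threshold) / temperature))

scores = np.array([0.9, 0.1, 0.5, 0.7, 0.2])  # stand-in importance scores
for temperature in (1.0, 0.1, 0.01):
    mask = soft_topk_mask(scores, k=2, temperature=temperature)
    # As the temperature anneals downward, the mask hardens toward {0, 1}.
    print(np.round(mask, 3))
```

At high temperature every entry of the mask stays near 0.5, so gradients flow to all mask parameters; only as the temperature decays does the selection commit, which is how the dynamic-temperature trick steers training away from non-sparse local minima.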
Theoretical analysis shows that as the temperature parameter approaches zero, the global optimum of the SMART pruning objective coincides with the global optimum of the underlying pruning problem, mitigating the bias introduced by regularization.
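Schematically, in illustrative notation (not necessarily the paper's), the claim is that the relaxed objective recovers the original constrained pruning problem in the zero-temperature limit:

```latex
\min_{w,\,s}\; L\bigl(w \odot \sigma_T(s)\bigr)
\;\xrightarrow{\;T \to 0\;}\;
\min_{w,\,m}\; L(w \odot m)
\quad \text{s.t.}\quad m \in \{0,1\}^d,\;\; \textstyle\sum_i m_i = k
```

Here $\sigma_T$ denotes the temperature-$T$ differentiable Top-k relaxation and $m$ the hard binary mask it converges to.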
Extensive experiments demonstrate that the SMART pruner outperforms state-of-the-art pruning methods, including PDP, PaS, AWG, and ACDC, across a range of computer vision models and tasks spanning classification, object detection, and image segmentation.
The SMART pruner also performs strongly on Transformer-based models in the N:M pruning setting, demonstrating its adaptability and robustness across different neural network architectures and pruning types.
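For reference, the N:M sparsity constraint means that within every group of M consecutive weights, at most N may be nonzero. A minimal sketch using plain magnitude-based selection (not SMART's learned probability-mask criterion) looks like this:

```python
import numpy as np

def nm_prune(weights, n=2, m=4):
    # Magnitude-based N:M pruning sketch (illustrative, not SMART's method):
    # within each group of m consecutive weights, keep the n largest in
    # magnitude and zero out the rest.
    w = weights.reshape(-1, m)
    drop_idx = np.argsort(np.abs(w), axis=1)[:, :m - n]  # smallest per group
    mask = np.ones_like(w)
    np.put_along_axis(mask, drop_idx, 0.0, axis=1)
    return (w * mask).reshape(weights.shape)

w = np.array([0.1, -0.9, 0.5, 0.05, 1.0, -0.2, 0.3, 0.8])
print(nm_prune(w, n=2, m=4))
```

The 2:4 pattern shown here is the one commonly accelerated by sparse tensor hardware, which is why N:M pruning is a natural target for Transformer deployment.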
Key insights distilled from the paper by Guanhua Ding... at arxiv.org, 04-01-2024
https://arxiv.org/pdf/2403.19969.pdf