Differentiable pruning combined with combinatorial optimization enhances structured neural network pruning.
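As a rough illustration of that combination (a generic sketch under assumed details, not this paper's specific algorithm), the PyTorch snippet below learns per-channel importance differentiably through soft gates, then hands the learned scores to a discrete selection step. The class and function names, and the use of plain top-k as the combinatorial step, are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class GatedLinear(nn.Module):
    """Linear layer whose output channels are scaled by learnable soft gates.
    Illustrative sketch only -- not the paper's exact method."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Differentiable importance logits, one per output channel.
        self.gate_logits = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gates = torch.sigmoid(self.gate_logits)  # soft gates in (0, 1)
        return self.linear(x) * gates

def combinatorial_select(layer: GatedLinear, k: int) -> torch.Tensor:
    """Discrete step: keep the k channels with the largest learned importance
    (a simple top-k heuristic standing in for a combinatorial optimizer)."""
    keep = torch.topk(layer.gate_logits, k).indices
    mask = torch.zeros_like(layer.gate_logits, dtype=torch.bool)
    mask[keep] = True
    return mask
```

In practice the soft gates would be trained jointly with the network, typically with a sparsity penalty on the gate logits, before the hard selection replaces them and the masked channels are removed.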
SPA is a versatile structured pruning framework that can prune neural networks of any architecture, from any framework, and at any stage of training, achieving state-of-the-art pruning results without fine-tuning or calibration data.
The authors propose a structured pruning method for pre-trained deep neural networks: unit activations are projected onto an orthogonal subspace, units are ranked by their non-redundant variance, and a single global variance-based cutoff automatically determines the layer-wise pruning ratios.
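This pipeline lends itself to a short sketch. The NumPy code below is a minimal, hedged interpretation rather than the authors' implementation: it assumes the orthogonal projection is a QR-style orthogonalization of the (samples x units) activation matrix, takes a unit's "non-redundant variance" to be the variance it adds beyond the span of the preceding units, and applies one global cutoff across all layers. The function names, the keep_frac parameter, and the exact cutoff rule are illustrative assumptions.

```python
import numpy as np

def nonredundant_variance(acts: np.ndarray) -> np.ndarray:
    """Variance each unit adds beyond the span of the preceding units.

    acts: (num_samples, num_units) activations for one layer; assumes
    num_samples >= num_units. Illustrative interpretation of
    'non-redundant variance', not the authors' exact definition.
    """
    centered = acts - acts.mean(axis=0, keepdims=True)
    # QR orthogonalizes the unit columns; the squared diagonal of R is the
    # sum of squares each column contributes beyond the earlier columns.
    _, r = np.linalg.qr(centered)
    return np.diag(r) ** 2 / (acts.shape[0] - 1)

def global_keep_masks(layer_acts, keep_frac=0.95):
    """One global cutoff: keep the highest-scoring units network-wide until
    keep_frac of the total non-redundant variance is covered."""
    scores = [nonredundant_variance(a) for a in layer_acts]
    flat = np.concatenate(scores)
    order = np.argsort(flat)[::-1]          # rank all units, best first
    cum = np.cumsum(flat[order]) / flat.sum()
    idx = min(np.searchsorted(cum, keep_frac), len(cum) - 1)
    cutoff = flat[order[idx]]
    return [s >= cutoff for s in scores]    # one boolean mask per layer
```

Because every layer's units are scored on a common variance scale, the single global cutoff implicitly assigns a different pruning ratio to each layer, which is how the layer-wise ratios come out automatically.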