Core Concepts
Structured model pruning can effectively compress deep learning models for computational pathology applications while maintaining negligible performance loss.
Abstract
The paper investigates the use of model pruning techniques to reduce the computational cost and memory footprint of deep learning models used in computational pathology applications, without significantly compromising their performance.
The key highlights are:
The authors propose a pruning strategy for U-Net-style encoder-decoder architectures, which are widely used in biomedical imaging tasks. The key difficulty it addresses is preserving the structural integrity of the shortcut (skip) connections between encoder and decoder layers during pruning.
They evaluate multiple pruning heuristics, including the L1-norm, L2-norm, and network slimming criteria, on two tasks: nuclei instance segmentation and classification, and colorectal cancer tissue classification. A minimal sketch of norm-based filter ranking appears after this list.
For nuclei instance segmentation and classification with the HoverNet model, the authors show that iterative pruning with the L2-norm heuristic can remove up to 90% of the model's parameters and reduce inference latency by 80% with a negligible performance drop. The iterative prune-and-fine-tune schedule is also sketched below.
For the smaller ResNet18 model on colorectal cancer tissue classification, the authors demonstrate that one-shot and iterative pruning achieve similar performance, with up to 75% model compression and a 3x inference speedup.
The results suggest that large models are not necessary for reliable and effective inference in digital and computational pathology, and that the pruned models can enable the deployment of AI in resource-constrained clinical settings and on edge devices.
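To make the norm-based heuristics concrete, here is a minimal PyTorch sketch of ranking convolutional filters by weight magnitude, the core idea behind the L1-norm and L2-norm criteria. The function names and the example layer are illustrative assumptions, not the paper's code.

```python
# Minimal sketch of L1/L2-norm filter ranking for structured pruning.
# Function names and the example layer are illustrative, not the paper's code.
import torch
import torch.nn as nn

def filter_importance(conv: nn.Conv2d, p: int = 2) -> torch.Tensor:
    """Score each output filter by the p-norm of its weights (p=1 or p=2)."""
    # conv.weight has shape (out_channels, in_channels, kH, kW);
    # flatten each filter and take its norm as the importance score.
    return conv.weight.detach().flatten(1).norm(p=p, dim=1)

def filters_to_prune(conv: nn.Conv2d, ratio: float, p: int = 2) -> torch.Tensor:
    """Indices of the lowest-scoring filters, i.e. candidates for removal."""
    scores = filter_importance(conv, p)
    n_prune = int(ratio * scores.numel())
    return torch.argsort(scores)[:n_prune]

conv = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3)
print(filters_to_prune(conv, ratio=0.5))  # the 64 weakest of 128 filters
```

Note that in a U-Net, removing an encoder filter also changes the channel count seen by the decoder block it feeds through a skip connection; respecting that coupling is exactly the structural constraint the proposed strategy handles.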
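The iterative schedule referenced above (prune a small fraction, fine-tune, repeat) can be sketched with PyTorch's built-in masking utilities. This is a hedged illustration under stated assumptions: `fine_tune` stands in for the user's training loop, and `torch.nn.utils.prune` only zeroes filters via masks, so realizing the reported parameter and latency savings requires physically removing the masked channels afterward.

```python
# Hedged sketch of iterative L2-norm structured pruning with fine-tuning
# rounds. `fine_tune` is a placeholder; the schedule is illustrative only.
import torch.nn as nn
import torch.nn.utils.prune as prune

def iterative_l2_prune(model: nn.Module, rounds: int,
                       amount_per_round: float, fine_tune) -> nn.Module:
    for _ in range(rounds):
        for module in model.modules():
            if isinstance(module, nn.Conv2d):
                # Mask out the lowest-L2-norm output channels (dim=0).
                prune.ln_structured(module, name="weight",
                                    amount=amount_per_round, n=2, dim=0)
        fine_tune(model)  # recover accuracy before the next round
    return model

# Toy usage: three rounds of 20% pruning with a no-op "fine-tune" step.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
iterative_l2_prune(model, rounds=3, amount_per_round=0.2,
                   fine_tune=lambda m: None)
```

One-shot pruning corresponds to a single round with a larger pruning amount; per the summary above, the paper finds the two comparable for the smaller ResNet18 model.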
Stats
Pruning up to 90% of the HoverNet model parameters resulted in an 80% reduction in inference latency.
Pruning the ResNet18 model for colorectal cancer tissue classification achieved up to 75% model compression and a 3x inference speedup.
Quotes
"Structured model pruning can effectively compress deep learning models for computational pathology applications while maintaining negligible performance loss."
"The pruned models, compact and efficient, may enable the deployment of AI in resource-constrained clinical sites and onto edge devices, for example, enabling AI on whole-slide scanners."