Core Concepts
Densely Connected Convolutional Networks (DenseNets) can outperform modern architectures like ResNets, Swin Transformers, and ConvNeXts by leveraging the underrated effectiveness of feature concatenation over additive shortcuts.
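Below is a minimal PyTorch sketch (an illustration, not the paper's code) of that distinction: an additive shortcut adds transformed features back onto the input at a fixed width, while a concatenation shortcut appends new features, growing the channel dimension. The layer widths and the `growth_rate` name are assumptions made for the example.

```python
import torch
import torch.nn as nn

class AdditiveBlock(nn.Module):
    """ResNet-style block: transformed features are added back to the input."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        return x + self.conv(x)  # channel count stays the same

class ConcatBlock(nn.Module):
    """DenseNet-style block: new features are concatenated onto the input."""
    def __init__(self, channels: int, growth_rate: int = 32):
        super().__init__()
        self.conv = nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.cat([x, self.conv(x)], dim=1)  # channels grow by growth_rate

x = torch.randn(1, 64, 56, 56)
print(AdditiveBlock(64)(x).shape)  # torch.Size([1, 64, 56, 56])
print(ConcatBlock(64)(x).shape)    # torch.Size([1, 96, 56, 56])
```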
Abstract
The paper revives Densely Connected Convolutional Networks (DenseNets) and demonstrates their previously overlooked potential. Through a comprehensive pilot study, the authors validate that feature concatenation can surpass the additive shortcuts used in prevalent architectures such as ResNets.
The authors then modernize DenseNets with a more memory-efficient design, abandoning ineffective components and enhancing architectural and block designs while preserving the essence of dense connectivity via concatenation. The resulting architecture, dubbed Revitalized DenseNet (RDNet), ultimately exceeds the performance of strong modern architectures such as Swin Transformer, ConvNeXt, and DeiT-III on ImageNet-1K. RDNet also performs competitively on downstream tasks such as ADE20K semantic segmentation and COCO object detection/instance segmentation.
Notably, RDNet does not slow down or degrade as the input size increases, unlike width-oriented networks that struggle with large intermediate tensors. The authors provide empirical analyses that shed light on the unique benefits of concatenation over additive shortcuts.
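To make the memory argument concrete, here is a hedged sketch of a simplified dense stage: every layer receives all earlier feature maps via concatenation, and a 1x1 transition convolution then compresses the accumulated channels so memory stays bounded. Names such as `DenseStage` and the specific layer counts and widths are illustrative assumptions, not RDNet's actual block design.

```python
import torch
import torch.nn as nn

class DenseStage(nn.Module):
    def __init__(self, in_channels: int, growth_rate: int,
                 num_layers: int, out_channels: int):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1),
            ))
            channels += growth_rate  # concatenation grows the channel count
        # transition: compress the accumulated channels to keep memory in check
        self.transition = nn.Conv2d(channels, out_channels, kernel_size=1)

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # each layer sees the concatenation of all previous feature maps
            features.append(layer(torch.cat(features, dim=1)))
        return self.transition(torch.cat(features, dim=1))

stage = DenseStage(in_channels=64, growth_rate=32, num_layers=4, out_channels=128)
print(stage(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 128, 56, 56])
```

Because the growth rate can stay small while depth accumulates features, the channel count is controlled by design rather than by widening every layer, which is one way a strategic design can keep concatenation memory-friendly.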
Stats
The number of parameters ranges from 24M to 186M across the RDNet model family.
The FLOPs range from 5.0G to 34.7G.
Inference latency ranges from 7.4ms to 933.7ms for a batch size of 128.
Memory usage ranges from 4.1GB to 10.9GB for a batch size of 16.
Quotes
"Concatenation shortcut is an effective way of increasing rank."
"A strategic design mitigates memory concerns."