This paper investigates how neural network architectural design choices affect continual learning (CL) performance. The authors systematically explore the effects of network width, depth, and components (skip connections, global pooling, and down-sampling) on both Task Incremental Learning (Task IL) and Class Incremental Learning (Class IL).
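To make the kind of ablation described above concrete, the sketch below builds a small configurable CNN whose width, depth, skip connections, global pooling, and down-sampling can each be toggled independently. This is a hypothetical PyTorch illustration, not the authors' code; the names `make_cnn` and `BasicBlock` and all defaults are assumptions.

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """Two 3x3 convs with an optional residual (skip) connection."""
    def __init__(self, channels, use_skip=True):
        super().__init__()
        self.use_skip = use_skip
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        if self.use_skip:          # toggle: skip connection
            out = out + x
        return self.relu(out)

def make_cnn(width=16, depth=4, use_skip=True,
             global_pool=True, downsample=True, num_classes=10):
    layers = [nn.Conv2d(3, width, 3, padding=1, bias=False),
              nn.BatchNorm2d(width), nn.ReLU(inplace=True)]
    for i in range(depth):         # toggle: depth
        layers.append(BasicBlock(width, use_skip))
        if downsample and i < depth - 1:   # toggle: down-sampling
            layers.append(nn.MaxPool2d(2))
    if global_pool:                # toggle: global average pooling
        layers.append(nn.AdaptiveAvgPool2d(1))
    layers += [nn.Flatten(), nn.LazyLinear(num_classes)]  # head infers input size
    return nn.Sequential(*layers)

# One cell of a width/depth/component grid:
net = make_cnn(width=32, depth=2, use_skip=False, global_pool=True)
logits = net(torch.randn(4, 3, 32, 32))   # -> shape (4, 10)
```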
Guided by these findings, the authors propose ArchCraft, an architecture design paradigm that recrafts standard backbones such as ResNet into CL-friendly variants (e.g., ResAC).
Extensive experiments demonstrate that the ArchCraft-guided architectures achieve state-of-the-art CL performance while being significantly more parameter-efficient than the baseline architectures. For example, ResAC-A outperforms ResNet-18 by up to 8.19% in last accuracy and 8.02% in average incremental accuracy, with 23% fewer parameters.
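For reference, the two metrics quoted above follow the standard CL convention: last accuracy is the overall accuracy on all seen classes after the final task, and average incremental accuracy is the mean of the overall accuracies measured after each task. A minimal sketch under that assumption (the function name and numbers are illustrative):

```python
import numpy as np

def cl_metrics(acc_after_task):
    """acc_after_task[t] = overall accuracy on everything seen so far,
    measured right after finishing task t."""
    acc = np.asarray(acc_after_task, dtype=float)
    last_accuracy = acc[-1]        # accuracy after the final task
    avg_incremental = acc.mean()   # mean over all incremental steps
    return last_accuracy, avg_incremental

# e.g. a toy 5-task Class IL run:
last, avg = cl_metrics([0.92, 0.81, 0.74, 0.70, 0.66])
print(f"last = {last:.2%}, average incremental = {avg:.2%}")
```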
The authors further analyze the stability and plasticity of the ArchCraft-guided architectures, showing that they forget less on previous tasks while achieving higher accuracy on new tasks than the baselines. This is attributed to the ArchCraft architectures' ability to extract more features that are shared across incremental tasks, as evidenced by the higher similarity of the representations they learn.
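Representation similarity of this kind is commonly quantified with linear CKA (Kornblith et al., 2019); whether the paper uses exactly this measure is an assumption here. A minimal sketch comparing features of the same inputs before and after learning a new task:

```python
import numpy as np

def linear_cka(X, Y):
    """X, Y: (n_samples, n_features) activations for the same inputs."""
    X = X - X.mean(axis=0)   # center features
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") *
                   np.linalg.norm(Y.T @ Y, "fro"))

# Toy stand-ins: features drift only slightly after the new task,
# so CKA stays close to 1.0 (highly shared representations).
feats_old = np.random.randn(256, 128)
feats_new = feats_old + 0.1 * np.random.randn(256, 128)
print(linear_cka(feats_old, feats_new))
```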
In summary, this work highlights the critical role of network architecture design in continual learning and proposes the ArchCraft method as an effective approach to craft CL-friendly architectures that balance stability and plasticity.
Source: Aojun Lu, Tao..., arxiv.org, 04-24-2024, https://arxiv.org/pdf/2404.14829.pdf