A novel continual learning approach that automatically expands pre-trained vision transformers by adding modular adapters and representation descriptors to accommodate distribution shifts in incoming tasks, without the need for memory rehearsal.
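The expansion mechanism described above can be sketched in PyTorch. This is a minimal illustration under stated assumptions, not the paper's exact modules: `Adapter`, `ExpandableBlock`, and the choice of when to call `expand` are hypothetical names and simplifications; the representation descriptors that detect distribution shift are omitted here.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual.
    The up-projection is zero-initialized so a freshly added adapter starts
    as an identity mapping and does not disturb existing behavior."""
    def __init__(self, dim, bottleneck=16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

class ExpandableBlock(nn.Module):
    """A frozen pre-trained block that can grow new adapters when a
    distribution shift is detected (the detection logic, driven by
    representation descriptors in the paper, is not modeled here)."""
    def __init__(self, block, dim):
        super().__init__()
        self.block = block
        for p in self.block.parameters():
            p.requires_grad_(False)  # pre-trained weights stay frozen
        self.adapters = nn.ModuleList([Adapter(dim)])

    def expand(self, dim):
        # Called when incoming data is judged out-of-distribution.
        self.adapters.append(Adapter(dim))

    def forward(self, x, task_id=-1):
        # Route through the frozen block, then the selected adapter.
        return self.adapters[task_id](self.block(x))
```

Because only the adapters carry trainable parameters, each expansion adds a small, isolated set of weights instead of rehearsing stored exemplars.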
The core message of this paper is that incrementally tuning the shared adapter, without imposing parameter-update constraints, is an effective continual learning strategy for pre-trained vision transformers. Further gains come from retraining a unified classifier on semantic-shift-compensated prototypes.
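The prototype compensation step can be sketched as follows. This is a hedged sketch in the style of semantic drift compensation, assuming the shift of each stored class prototype is estimated from how current-task features move before vs. after adapter tuning; `compensate_prototypes` and `sigma` are illustrative names, not the paper's API.

```python
import numpy as np

def compensate_prototypes(old_protos, feats_before, feats_after, sigma=1.0):
    """Shift each stored class prototype by a weighted average of the drift
    observed on current-task features, where weights are Gaussian
    similarities between the prototype and the pre-update features.

    old_protos:   (C, D) class prototypes saved under the old feature space
    feats_before: (N, D) current-task features under the old backbone
    feats_after:  (N, D) the same samples' features after adapter tuning
    """
    drift = feats_after - feats_before                 # per-sample drift (N, D)
    compensated = []
    for p in old_protos:
        d2 = np.sum((feats_before - p) ** 2, axis=1)   # squared distances (N,)
        w = np.exp(-d2 / (2 * sigma ** 2))             # Gaussian similarity
        w = w / (w.sum() + 1e-8)                       # normalize to a convex combination
        compensated.append(p + w @ drift)              # move prototype along local drift
    return np.stack(compensated)
```

The compensated prototypes can then serve as class means for retraining the unified classifier, e.g. by sampling synthetic features around them, so old classes remain comparable with new ones in the updated feature space.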