SEEKR is a novel replay-based distillation method for continual learning of large language models that addresses catastrophic forgetting by selectively distilling knowledge from important attention heads, improving data efficiency and performance.
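A minimal sketch of the core idea, attention-head-level distillation on replayed data, assuming both the old (teacher) and current (student) models expose per-head attention maps; function and variable names are illustrative, not SEEKR's actual code.

```python
import torch
import torch.nn.functional as F

def head_distillation_loss(student_attn, teacher_attn, head_importance, top_k=8):
    # student_attn / teacher_attn: (batch, heads, seq, seq) attention maps on replay data
    # head_importance: (heads,) scores; only the top-k heads are distilled
    idx = torch.topk(head_importance, k=top_k).indices
    s = student_attn[:, idx]                              # (B, k, L, L)
    t = teacher_attn[:, idx]
    w = head_importance[idx] / head_importance[idx].sum() # normalized head weights
    # KL divergence between teacher and student attention distributions, per head
    kl = F.kl_div(torch.log(s + 1e-8), t, reduction="none").sum(-1).mean(dim=(0, 2))
    return (w * kl).sum()

# toy usage on random "replay" attention maps
B, H, L = 2, 16, 32
teacher = torch.softmax(torch.randn(B, H, L, L), dim=-1)
student = torch.softmax(torch.randn(B, H, L, L), dim=-1)
importance = torch.rand(H)
loss = head_distillation_loss(student, teacher, importance)
```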
Online-LoRA is a novel method that enables continual learning for vision transformers in task-free online settings by leveraging low-rank adaptation and online weight regularization to mitigate catastrophic forgetting and adapt to evolving data streams.
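An illustrative sketch of the two named ingredients, a low-rank adapter on a frozen linear layer plus an online quadratic penalty that anchors adapter weights to previously learned values weighted by accumulated importance; names are assumptions, not taken from the Online-LoRA codebase.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                    # frozen pre-trained weight
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T  # low-rank update on top of base

def online_reg(params, anchors, importances):
    # Online weight regularization: keep adapter weights close to their running
    # anchors, with per-parameter importance weights (EWC-style quadratic penalty).
    return sum((imp * (p - a) ** 2).sum()
               for p, a, imp in zip(params, anchors, importances))
```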
This paper shows that merging the sparse orthogonal parameters of models trained on multiple tasks yields strong continual learning performance, adapting to new tasks without forgetting existing knowledge.
Merging sparsely updated parameter deltas from fine-tuned Vision Transformers, guided by the principle of orthogonality, effectively combats catastrophic forgetting in continual learning tasks.
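A minimal sketch of sparse delta merging in general: compute each task's weight delta against the shared backbone, keep only the largest-magnitude entries, and add the sparsified deltas back into the base. This illustrates the merging step only, not the paper's exact orthogonality-constrained procedure.

```python
import torch

def sparsify(delta: torch.Tensor, keep_ratio: float = 0.05) -> torch.Tensor:
    # Zero out all but the top `keep_ratio` fraction of entries by magnitude.
    k = max(1, int(delta.numel() * keep_ratio))
    thresh = delta.abs().flatten().topk(k).values.min()
    return delta * (delta.abs() >= thresh)

def merge(base_state, task_states, keep_ratio=0.05):
    # base_state / task_states: state dicts of the pre-trained and fine-tuned models
    merged = {name: w.clone() for name, w in base_state.items()}
    for task_state in task_states:
        for name, w in task_state.items():
            merged[name] += sparsify(w - base_state[name], keep_ratio)
    return merged
```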
SAFE, a novel continual learning framework, leverages the strengths of both slow and fast learners with parameter-efficient tuning to effectively transfer knowledge from pre-trained models and adapt to new information without catastrophic forgetting.
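One common way to realize such a slow/fast learner pair is to train a lightweight adapter (fast) while maintaining an exponential-moving-average copy (slow) used at inference; the sketch below shows only that generic idea under this assumption, not SAFE's actual design.

```python
import copy
import torch

@torch.no_grad()
def ema_update(slow_adapter, fast_adapter, momentum: float = 0.999):
    # The slow learner drifts toward the fast learner, smoothing task-specific updates.
    for s, f in zip(slow_adapter.parameters(), fast_adapter.parameters()):
        s.mul_(momentum).add_(f, alpha=1.0 - momentum)

fast = torch.nn.Linear(768, 768)   # stands in for a parameter-efficient tuning module
slow = copy.deepcopy(fast)
for p in slow.parameters():
    p.requires_grad_(False)        # slow learner is never updated by gradients
ema_update(slow, fast)
```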
DualLoRA, a novel continual learning method for pre-trained vision transformers, leverages orthogonal and residual low-rank adaptations with a dynamic memory mechanism to effectively mitigate catastrophic forgetting while maintaining high efficiency.
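A minimal sketch of a dual-adapter layer in this spirit: one low-rank branch whose update is projected orthogonally to a stored subspace of earlier tasks, plus an unconstrained residual branch on a frozen linear layer. Module and buffer names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DualLoRALayer(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)
        d_in, d_out = base.in_features, base.out_features
        self.A_orth = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B_orth = nn.Parameter(torch.zeros(d_out, rank))
        self.A_res = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B_res = nn.Parameter(torch.zeros(d_out, rank))
        # Orthonormal basis of directions used by previous tasks ("memory").
        self.register_buffer("memory", torch.zeros(d_in, 0))

    def forward(self, x):
        A = self.A_orth
        if self.memory.shape[1] > 0:
            # Remove components lying in the previous-task subspace.
            A = A - A @ self.memory @ self.memory.T
        orth = x @ A.T @ self.B_orth.T
        res = x @ self.A_res.T @ self.B_res.T
        return self.base(x) + orth + res
```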
EXACFS, a novel distillation-based approach, effectively mitigates catastrophic forgetting in class incremental learning by preserving significant features from previous tasks while allowing flexibility for learning new ones, achieving superior stability and plasticity balance compared to existing methods.
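A minimal sketch of importance-weighted feature distillation for class-incremental learning: features of the current model are pulled toward those of the frozen previous model, with per-dimension significance weights. Names and the weighting scheme are illustrative assumptions.

```python
import torch

def weighted_feature_distill(new_feats, old_feats, significance):
    # new_feats, old_feats: (batch, dim); significance: (dim,) per-feature weights
    return (significance * (new_feats - old_feats) ** 2).mean()

feats_new = torch.randn(8, 512, requires_grad=True)   # current model's features
feats_old = torch.randn(8, 512)                       # frozen previous model's features
sig = torch.rand(512)                                 # e.g. accumulated importance scores
loss = weighted_feature_distill(feats_new, feats_old, sig)
loss.backward()
```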
VQ-Prompt is a novel method that leverages vector quantization to enable end-to-end training of discrete prompts in vision transformers, effectively mitigating catastrophic forgetting in continual learning scenarios.
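A minimal sketch of the vector-quantization step, assuming a codebook of prompt keys and a straight-through estimator so the discrete selection stays differentiable end-to-end; shapes and names are illustrative, not VQ-Prompt's actual interface.

```python
import torch

def vq_prompt(query, codebook):
    # query: (batch, dim) continuous prompt query; codebook: (num_prompts, dim)
    dists = torch.cdist(query, codebook)          # (batch, num_prompts)
    idx = dists.argmin(dim=-1)                    # discrete prompt choice
    quantized = codebook[idx]
    # Straight-through estimator: forward uses the discrete prompt,
    # gradients flow back through the continuous query.
    quantized = query + (quantized - query).detach()
    return quantized, idx

q = torch.randn(4, 64, requires_grad=True)
book = torch.randn(10, 64)
prompts, chosen = vq_prompt(q, book)
prompts.sum().backward()                          # gradients reach the query
```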
This paper proposes a new algorithm called Self-Normalized Resets (SNR) to address the loss of plasticity that arises during continual learning of deep models. SNR detects inactive neurons and resets their weights, maintaining the model's ability to learn and preventing performance degradation.
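A minimal sketch of the reset mechanism described above: track each hidden unit's average activation and, when it falls below a threshold, re-initialize that unit's incoming weights and bias. The threshold and names are illustrative assumptions, not the paper's exact criterion.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def reset_dormant_units(layer: nn.Linear, avg_activation: torch.Tensor,
                        threshold: float = 1e-3):
    # avg_activation: (out_features,) running average of each unit's activation
    dormant = avg_activation < threshold
    if dormant.any():
        fresh = torch.empty_like(layer.weight)
        nn.init.kaiming_uniform_(fresh)
        layer.weight[dormant] = fresh[dormant]    # reinitialize incoming weights
        layer.bias[dormant] = 0.0
    return dormant

layer = nn.Linear(128, 64)
running_act = torch.rand(64) * 0.01               # stands in for tracked activation rates
reset_dormant_units(layer, running_act)
```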