Core Concepts
The authors propose the Self-Synthesized Rehearsal (SSR) framework to address catastrophic forgetting in large language models: the model generates its own synthetic instances for rehearsal, achieving performance superior or comparable to conventional rehearsal-based methods.
Abstract
The paper addresses catastrophic forgetting in large language models through the SSR framework. SSR generates synthetic instances for rehearsal, matching or outperforming conventional rehearsal-based methods while preserving the LLM's generalization capabilities.
Large language models (LLMs) face catastrophic forgetting during continual learning.
Conventional rehearsal-based methods rely on previous training data.
The proposed SSR framework uses the LLM itself to generate synthetic instances for rehearsal.
SSR achieves superior performance and data efficiency compared to conventional approaches.
Experiments show SSR preserves generalization capabilities of LLMs in various domains.
In continual learning, an LLM is updated sequentially with the instruction data of each stage.
Rehearsal-based methods sample training instances from previous stages to expand current training data.
SSR instead uses the base LLM with in-context learning to generate synthetic instances, then refines their outputs with the latest LLM (see the sketch below).
High-quality synthetic instances are then selected and used for rehearsal in later stages.
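The following is a minimal sketch of that three-step loop, not the paper's actual implementation. It assumes three caller-supplied callables, all with illustrative names: `base_generate` (the base LLM synthesizing a new instruction from in-context demonstrations), `latest_generate` (the most recently trained LLM producing the refined output), and `score` (a quality heuristic for filtering).

```python
import random
from typing import Callable

def self_synthesized_rehearsal(
    demos: list[dict],                     # a few real (instruction, output) demonstrations
    base_generate: Callable[[str], str],   # base LLM: demo prompt -> new synthetic instruction
    latest_generate: Callable[[str], str], # latest LLM: instruction -> refined output
    score: Callable[[dict], float],        # quality heuristic for a synthetic instance
    n_candidates: int = 100,
    n_keep: int = 20,
) -> list[dict]:
    """Produce synthetic rehearsal instances for later training stages."""
    synthetic = []
    for _ in range(n_candidates):
        # 1) In-context generation: prompt the base LLM with a few
        #    demonstrations so it synthesizes a new, similar instruction.
        sampled = random.sample(demos, k=min(3, len(demos)))
        prompt = "\n\n".join(f"Instruction: {d['instruction']}" for d in sampled)
        instruction = base_generate(prompt)
        # 2) Output refinement: the latest LLM answers the synthetic
        #    instruction, so the target reflects previously acquired skills.
        output = latest_generate(instruction)
        synthetic.append({"instruction": instruction, "output": output})
    # 3) Selection: keep only the highest-scoring candidates for rehearsal.
    synthetic.sort(key=score, reverse=True)
    return synthetic[:n_keep]
```

The key design point is that no real data from earlier stages is needed: the base LLM supplies the inputs and the latest LLM supplies the targets.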
Regularization-based, architecture-based, and rehearsal-based methods are the main approaches to continual learning.
Rehearsal-based methods store a subset of data from previous tasks for future rehearsal (a sketch follows below).
Prior approaches rely on access to the original training data, which may be unavailable in real-world applications.
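For contrast, here is a hedged sketch of the conventional data-rehearsal baseline that SSR replaces when real data is unavailable: a buffer keeps a random subset of each finished stage's real training data and mixes it into the next stage's data. The class and parameter names are illustrative assumptions, not from the paper.

```python
import random

class RehearsalBuffer:
    """Conventional rehearsal: replay stored real instances from past stages."""

    def __init__(self, per_stage: int = 200):
        self.per_stage = per_stage      # how many real instances to keep per stage
        self.memory: list[dict] = []

    def store(self, stage_data: list[dict]) -> None:
        # Keep only a small random subset of the stage's real training data.
        k = min(self.per_stage, len(stage_data))
        self.memory.extend(random.sample(stage_data, k))

    def mix(self, current_data: list[dict]) -> list[dict]:
        # Expand the current stage's training data with replayed instances.
        mixed = current_data + self.memory
        random.shuffle(mixed)
        return mixed
```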
SSR achieves performance superior or comparable to conventional rehearsal baselines.
Experiments on the SuperNI dataset show SSR's effectiveness in mitigating catastrophic forgetting.