Large-scale increases in instruction data during fine-tuning can cause LLMs to forget world knowledge; LoRAMoE mitigates this forgetting while enhancing downstream multitask performance.
LoRAMoE is a novel framework that resolves the conflict between improving LLM performance on downstream tasks and preventing world knowledge forgetting during supervised fine-tuning (SFT).
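The core mechanism pairs frozen pretrained weights with a router-gated mixture of low-rank (LoRA) experts, so world knowledge stored in the base weights is untouched while the experts absorb instruction data. Below is a minimal sketch of such a layer under those assumptions; the class and parameter names (`LoRAMoELayer`, `num_experts`, `rank`, `alpha`) are illustrative, not the paper's exact API, and the paper's localized balancing constraint on the router is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRAMoELayer(nn.Module):
    """Sketch of a LoRAMoE-style layer: a frozen base linear projection
    plus a softmax-router-weighted mixture of LoRA experts (assumed form)."""

    def __init__(self, d_in, d_out, num_experts=4, rank=8, alpha=16.0):
        super().__init__()
        # Frozen pretrained projection: world knowledge stays in these weights.
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)
        self.base.bias.requires_grad_(False)
        # Each expert is a trainable low-rank pair (A: d_in -> rank, B: rank -> d_out).
        self.lora_A = nn.ModuleList(nn.Linear(d_in, rank, bias=False) for _ in range(num_experts))
        self.lora_B = nn.ModuleList(nn.Linear(rank, d_out, bias=False) for _ in range(num_experts))
        for B in self.lora_B:
            nn.init.zeros_(B.weight)  # experts start as a no-op, matching the base model
        self.router = nn.Linear(d_in, num_experts)  # token-wise gating over experts
        self.scaling = alpha / rank

    def forward(self, x):
        # x: (batch, seq, d_in)
        gate = F.softmax(self.router(x), dim=-1)              # (B, S, E)
        expert_out = torch.stack(
            [B(A(x)) for A, B in zip(self.lora_A, self.lora_B)], dim=-1
        )                                                      # (B, S, d_out, E)
        mixed = (expert_out * gate.unsqueeze(-2)).sum(dim=-1)  # router-weighted sum
        return self.base(x) + self.scaling * mixed

layer = LoRAMoELayer(d_in=512, d_out=512)
y = layer(torch.randn(2, 16, 512))  # -> shape (2, 16, 512)
```

Because only the experts and the router receive gradients, fine-tuning on large instruction sets adapts the mixture without overwriting the frozen base weights, which is the intuition behind preserving world knowledge.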