Ziya2 is a 13-billion-parameter language model developed with a data-centric approach: by optimizing the selection and use of pre-training data, it strengthens the model's capabilities on Chinese, mathematics, and programming tasks while maintaining or improving performance on general English benchmarks.
The LLM-ADE framework introduces a novel approach to continual pre-training of large language models, enabling efficient integration of new datasets while mitigating catastrophic forgetting and double descent.