Core Concepts
Different abilities of the model can be cumulatively enhanced by mixing minimal optimal sets of corresponding types of data.
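A minimal sketch of what this mixing could look like in practice, assuming each ability's minimal optimal set is stored as a JSONL file of question/solution pairs (file names and schema are illustrative assumptions, not from the paper):

    import json
    import random

    def load_jsonl(path):
        # Load one ability-specific SFT subset (question/solution pairs).
        with open(path) as f:
            return [json.loads(line) for line in f]

    # Hypothetical files: each holds the minimal optimal set
    # (deduplicated reasoning paths) for one type of data.
    ability_sets = [
        load_jsonl("cot_minimal_set.jsonl"),  # natural-language chain of thought
        load_jsonl("pot_minimal_set.jsonl"),  # program-of-thought / code reasoning
    ]

    # Mixing is simply the union of the per-ability minimal optimal sets;
    # the claim is that the corresponding abilities stack cumulatively.
    sft_mixture = [ex for subset in ability_sets for ex in subset]
    random.seed(0)
    random.shuffle(sft_mixture)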
Abstract
Large language models (LLMs) show emergent abilities on math reasoning tasks.
Recent work has focused on enhancing open-source LLMs through supervised fine-tuning (SFT).
The paper explores a general data strategy for supervision data to optimize and expand math reasoning ability.
First, the ability boundary of reasoning paths augmentation is determined by identifying the minimal optimal set of such paths (see the deduplication sketch after this abstract).
Second, different abilities of the model are cumulatively enhanced by mixing the minimal optimal sets of the corresponding types of data.
Re-examination shows that GSM-HARD is not really hard and that numerical robustness is no longer a prevalent weakness of today's LLMs.
An Auto Problem Generator is developed for robustness testing and educational applications (see the perturbation sketch after this abstract).
The resulting MMOS data strategy achieves SOTA performance on a series of base models at much lower construction cost.
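The boundary finding implies that, for each question, only a handful of distinct reasoning paths add value. A hedged sketch of extracting such a minimal optimal set from sampled solutions, assuming a simple question/solution schema and exact-match deduplication (the paper's criterion may be stronger):

    from collections import defaultdict

    def minimal_optimal_set(samples, k=4):
        # Keep at most k distinct reasoning paths per question; beyond a
        # small number of distinct paths, augmentation stops helping.
        by_question = defaultdict(list)
        for sample in samples:
            paths = by_question[sample["question"]]
            solution = sample["solution"].strip()
            if solution not in paths and len(paths) < k:  # exact-match dedup
                paths.append(solution)
        return [
            {"question": q, "solution": p}
            for q, paths in by_question.items()
            for p in paths
        ]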
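GSM-HARD was built by substituting larger numbers into GSM8K problems, and a robustness-testing Auto Problem Generator can be sketched along the same lines: perturb the numbers in a problem, then recompute the answer from a reference solution program. The regex substitution and toy solution below are illustrative assumptions, not the paper's implementation:

    import random
    import re

    def perturb_numbers(problem, rng):
        # Replace each integer in the problem text with a fresh random
        # value, returning the new text plus the substituted values so
        # the ground-truth answer can be recomputed.
        values = []
        def swap(match):
            new = rng.randint(2, 99)
            values.append(new)
            return str(new)
        return re.sub(r"\d+", swap, problem), values

    rng = random.Random(0)
    problem = "Tom has 3 boxes with 12 apples each. How many apples in total?"
    solve = lambda boxes, per_box: boxes * per_box  # toy reference solution

    new_problem, (boxes, per_box) = perturb_numbers(problem, rng)
    print(new_problem, "->", solve(boxes, per_box))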
Quotes
"Different abilities of the model can be cumulatively enhanced by mixing minimal optimal sets of corresponding types of data."
"GSM-HARD is not really hard and today’s LLMs no longer lack numerical robustness."
"MMOS achieve SOTA performance on series base models under much lower construction costs."