Core Concepts
The authors explore a data strategy for optimizing and expanding the math reasoning ability of Large Language Models (LLMs) through supervised fine-tuning and data augmentation.
Abstract
This empirical study examines the ability boundary of reasoning-path augmentation along four directions: identifying minimal optimal sets, enhancing weak abilities, improving numerical robustness, and expanding existing abilities. It shows that a Mix of Minimal Optimal Sets (MMOS) achieves state-of-the-art performance at a lower data-construction cost.
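The numerical-robustness idea can be illustrated by resampling the numbers in a word-problem template while recomputing the gold answer, so the model sees many numeric variants of the same reasoning pattern. This is a minimal sketch with hypothetical names (`numeric_variants`, the `{a}`/`{b}` template), not the paper's actual augmentation pipeline.

```python
import random

def numeric_variants(template, answer_fn, n=3, seed=0):
    """Generate numerically perturbed copies of a word problem.

    template  -- question text with {a} and {b} placeholders (assumed format)
    answer_fn -- computes the gold answer from the sampled numbers
    Returns a list of (question, answer) pairs.
    """
    rng = random.Random(seed)
    variants = []
    for _ in range(n):
        a, b = rng.randint(2, 99), rng.randint(2, 99)
        variants.append((template.format(a=a, b=b), answer_fn(a, b)))
    return variants

pairs = numeric_variants(
    "Tom has {a} apples and buys {b} more. How many apples does he have?",
    lambda a, b: a + b,
)
```

Each variant keeps the same reasoning structure (one addition step) but different surface numbers, which is the kind of perturbation used to probe whether a model is robust to numeric changes.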
Stats
GSM8K: 7,473 train / 1,319 test
MATH: 7,500 train / 5,000 test
Quotes
"Providing varied, deduplicated, and correct reasoning paths can improve math reasoning ability."
"Different abilities of the model can be cumulatively enhanced by mixing minimal optimal sets of corresponding types of data."