Core Concept
AS-ES learning maximizes small models' potential in CoT-intensive tasks by segmenting CoT data for iterative generation.
Abstract
The article introduces AS-ES learning, a training paradigm for efficient CoT learning in small models that reuses existing CoT data without collecting any additional data. It examines why small models learn CoT inefficiently, offering insight into the underlying mechanism of CoT, and segments CoT data into extractive segments (ES) and abstractive segments (AS) to construct an AS-ES dataset for iterative learning. Experiments with different training strategies and model sizes show that AS-ES learning improves performance across various CoT-intensive tasks, while also yielding insights into the training process and hyperparameter settings.
Introduction
CoT is crucial for logical reasoning in LLMs.
Prior work attempts to induce CoT ability in small models.
Proposed AS-ES learning for iterative generation.
Methodology
AS-ES Segmentation: Extractive and abstractive segments.
AS-ES Dataset Construction: Tailored dataset for iterative learning.
AS-ES Learning: Dual-path and uni-path learning.
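The summary above does not spell out how segmentation and iterative learning fit together. A minimal sketch, assuming a simple token-overlap heuristic as a stand-in for the paper's segmenter (function names, the threshold, and the heuristic itself are illustrative assumptions, not the paper's exact method):

```python
# Toy sketch of AS-ES style segmentation and iterative training-pair
# construction. The overlap-based labeling below is an illustrative
# assumption standing in for the paper's segmentation strategy.

def segment_cot(question, cot_steps, overlap_threshold=0.5):
    """Label each CoT step ES (extractive: mostly restates tokens from the
    question) or AS (abstractive: introduces new reasoning content)."""
    q_tokens = set(question.lower().split())
    labeled = []
    for step in cot_steps:
        tokens = step.lower().split()
        overlap = sum(t in q_tokens for t in tokens) / max(len(tokens), 1)
        labeled.append(("ES" if overlap >= overlap_threshold else "AS", step))
    return labeled

def build_as_es_pairs(question, labeled_steps):
    """Build (context -> next segment) pairs for iterative learning: the
    model is trained to emit each segment given everything before it."""
    pairs = []
    context = question
    for label, step in labeled_steps:
        pairs.append({"input": context, "target": step, "type": label})
        context = context + " " + step
    return pairs
```

At inference, the same loop runs iteratively: the model generates one segment, the segment is appended to the context, and generation continues until an end marker, rather than emitting the whole CoT in a single pass.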
Experiment
Dataset: math word problem (MWP) and PET tasks.
Implementation: Base models and training process.
Results: AS-ES learning improves model performance.
Hyperparameters: Impact on training strategies.
Discussion
Effect of Segmentation: Different strategies impact AS-ES learning.
Effect of Hyperparameters: β and γ affect model performance.
Why AS-ES Learning Works: Lower loss boundary compared to direct approach.
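The entropy-oriented segmentation discussed above can be sketched as follows. This is a toy illustration only: it assumes per-token entropies are already available from some scoring model, and the threshold and function names are hypothetical.

```python
# Toy sketch of entropy-oriented segmentation: sentences the model generates
# with low entropy (high confidence) are treated as extractive (ES), while
# high-entropy sentences requiring abstraction are labeled AS.
# The threshold value is a hypothetical knob, not taken from the paper.

def entropy_segment(sentences, entropies, threshold=1.0):
    """Label each CoT sentence by mean per-token entropy.

    sentences: list of CoT sentences.
    entropies: list of per-token entropy lists (one per sentence),
               assumed precomputed by a scoring model.
    """
    labeled = []
    for sent, ent in zip(sentences, entropies):
        mean_ent = sum(ent) / len(ent)
        labeled.append(("ES" if mean_ent < threshold else "AS", sent))
    return labeled
```

Under this view, the summary's claim about generalizability amounts to saying the entropy signal separates extractive from abstractive content similarly across model sizes and tasks.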
Statistics
"AS-ES learning improves the model performance on both tasks."
"Entropy-oriented segmentation shows generalizability across different model sizes and tasks."
"AS-ES learning works by achieving a generally lower loss boundary compared to the direct approach."
Quotes
"AS-ES learning maximizes the latent potential of small models for CoT-intensive tasks."
"The limitations of small models in CoT learning stem from the training paradigm instead of their inherent capacity."