Core Concepts
Learning from mistakes can effectively improve the chain-of-thought reasoning capabilities of large language models across various mathematical and commonsense reasoning tasks.
Summary
The paper introduces LEMA ("LEarning from MistAkes"), an approach that further improves the chain-of-thought (CoT) reasoning capabilities of large language models (LLMs) on mathematical and commonsense reasoning tasks.
Key highlights:
- LEMA fine-tunes LLMs on mistake-correction data pairs, mimicking the error-driven learning process of human students.
- The mistake-correction data is generated by first collecting inaccurate reasoning paths from various LLMs, and then using GPT-4 as a "corrector" to identify the mistaken step, explain why it is wrong, and provide the corrected solution (see the sketch after this list).
- A correction-centric evolution strategy expands the question set, yielding more diverse correction data.
- Experiments on five open-source LLMs and five challenging reasoning tasks demonstrate that LEMA consistently outperforms fine-tuning on CoT data alone.
- Ablation studies reveal the non-homogeneous effectiveness of CoT data and correction data, and show that the correction-centric evolution strategy is more beneficial than random question selection.
- LEMA can also enhance the performance of specialized LLMs like WizardMath and MetaMath, and improves the commonsense reasoning of LLaMA-2-70B on CSQA.
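
To make the data-generation step concrete, here is a minimal sketch in Python. The helper names (`sample_path`, `call_gpt4`) and the corrector prompt wording are illustrative assumptions, not the paper's exact implementation; only the overall flow (collect inaccurate CoT paths, have GPT-4 identify and explain the mistake, emit corrected fine-tuning pairs) follows the description above.

```python
# Sketch of LEMA-style mistake-correction data generation (illustrative).
# Assumed helpers: sample_path(llm, q) -> (cot_text, final_answer) queries
# an LLM for a CoT solution; call_gpt4(prompt) -> str wraps a GPT-4 call.

CORRECTOR_PROMPT = (
    "Below is a question and an incorrect solution.\n"
    "1. Identify the first incorrect step.\n"
    "2. Explain why it is wrong.\n"
    "3. Provide a corrected solution.\n\n"
    "Question: {question}\nIncorrect solution: {solution}"
)

def build_correction_pairs(questions, gold_answers, llms,
                           sample_path, call_gpt4):
    """Collect wrong CoT paths from several LLMs, then have GPT-4
    correct them, yielding (input, target) pairs for fine-tuning."""
    pairs = []
    for q, gold in zip(questions, gold_answers):
        for llm in llms:
            cot_text, answer = sample_path(llm, q)
            if answer == gold:
                continue  # keep only inaccurate reasoning paths
            correction = call_gpt4(
                CORRECTOR_PROMPT.format(question=q, solution=cot_text)
            )
            # The model later learns to locate the mistake, explain it,
            # and produce the corrected solution.
            pairs.append({"input": f"{q}\n{cot_text}",
                          "target": correction})
    return pairs
```

Each resulting pair uses the question plus the wrong reasoning path as input and GPT-4's correction as the target, matching the mistake-correction format described above.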
Stats
Step 1: Tina makes $18.00 an hour for 8 hours, which is 8 * $18.00 = $144.00.
Step 2: She makes $27.00 an hour for the 2 hours of overtime, which is 2 * $27.00 = $54.00.
Step 3: For one day, she makes $144.00 + $54.00 = $198.00.
Step 4: For 5 days, she makes $198.00 * 5 = $990.00.
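
As a quick sanity check, the arithmetic in these steps can be reproduced in a few lines; the setup (an $18.00 base rate, 1.5x overtime pay beyond 8 hours, and five 10-hour days) is inferred from the steps themselves rather than stated in this excerpt.

```python
# Verify the sample solution's arithmetic (assumed GSM8K-style setup).
base_rate = 18.00
overtime_rate = base_rate * 1.5          # $27.00/hour
regular_pay = 8 * base_rate              # Step 1: $144.00
overtime_pay = 2 * overtime_rate         # Step 2: $54.00
daily_pay = regular_pay + overtime_pay   # Step 3: $198.00
total = daily_pay * 5                    # Step 4: $990.00
assert total == 990.00
print(f"Total for 5 days: ${total:.2f}")
```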
Quotes
"Mistakes are the portals of discovery."
James Joyce