This paper explores enhancing the agent capabilities of 7B and 13B open-source language models. The authors propose two key approaches:
Supervised Fine-Tuning (SFT): The authors use GPT-4 to construct agent-specific data that captures the interactive behavior between the agent and its environment. Fine-tuning the low-parameter language models on this data, together with general instruction-tuning data, significantly reduces hallucinated outputs and formatting errors in agent tasks.
Multi-Branch Reasoning: For complex agent tasks, the authors find that a single reasoning path may not yield the optimal answer. They introduce techniques such as task decomposition and backtracking to reduce problem complexity and improve model performance. Task decomposition breaks a complex task into smaller subtasks, while backtracking lets the model return to an earlier reasoning step and explore an alternative path.
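To make the two multi-branch ideas concrete, here is a minimal sketch in Python. It is not the authors' implementation: the toy task (building the string "cab" one letter at a time), the helper `propose_letters`, and all function names are assumptions chosen for illustration. In a real agent, the proposal function would be a language model call and the state would be the interaction history.

```python
from typing import Callable, List, Optional, Sequence


def solve_with_backtracking(
    state: str,
    propose: Callable[[str], Sequence[str]],
    is_goal: Callable[[str], bool],
    max_depth: int,
) -> Optional[List[str]]:
    """Depth-first search over candidate steps.

    Returns the first list of steps that reaches the goal, or None if
    every branch within max_depth fails.
    """
    if is_goal(state):
        return []
    if max_depth == 0:
        return None
    for step in propose(state):
        rest = solve_with_backtracking(state + step, propose, is_goal, max_depth - 1)
        if rest is not None:
            return [step] + rest
        # This branch failed: fall through and try the next candidate (backtrack).
    return None


def solve_decomposed(
    start: str,
    subgoals: Sequence[Callable[[str], bool]],
    propose: Callable[[str], Sequence[str]],
    depth_per_subtask: int,
) -> Optional[List[str]]:
    """Solve a task as a sequence of subtasks, each with its own small search."""
    state, plan = start, []
    for is_goal in subgoals:
        steps = solve_with_backtracking(state, propose, is_goal, depth_per_subtask)
        if steps is None:
            return None
        plan.extend(steps)
        state += "".join(steps)
    return plan


def propose_letters(state: str) -> List[str]:
    """Hypothetical stand-in for a model proposing next steps."""
    return ["a", "b", "c"]


if __name__ == "__main__":
    # Single search over the whole task.
    print(solve_with_backtracking("", propose_letters, lambda s: s == "cab", 3))
    # Same task split into two subtasks: first reach "ca", then "cab".
    subgoals = [lambda s: s == "ca", lambda s: s == "cab"]
    print(solve_decomposed("", subgoals, propose_letters, 2))
```

Both calls find the same plan, but the decomposed version searches two shallow trees instead of one deeper tree, which is the sense in which decomposition reduces problem complexity.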
The authors evaluate their methods on five agent tasks from the AgentBench benchmark and report that they outperform the baseline approaches. They conclude that combining SFT with multi-branch reasoning effectively improves the agent capabilities of low-parameter language models.
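The SFT step mixes agent-specific data with general instruction-tuning data. The sketch below shows one simple way such a mixture could be assembled. The function name `mix_sft_data`, the `agent_fraction` parameter, and the example records are assumptions for illustration; the summary does not state the mixing ratio or the data format the authors used.

```python
import random
from typing import Dict, List

Example = Dict[str, str]


def mix_sft_data(
    agent_data: List[Example],
    general_data: List[Example],
    agent_fraction: float,
    size: int,
    seed: int = 0,
) -> List[Example]:
    """Sample a training set of `size` examples with a fixed share of agent data.

    Sampling is with replacement, so either source may be smaller than its share.
    """
    if not 0.0 <= agent_fraction <= 1.0:
        raise ValueError("agent_fraction must be between 0 and 1")
    rng = random.Random(seed)
    n_agent = round(size * agent_fraction)
    n_general = size - n_agent
    mixed = rng.choices(agent_data, k=n_agent) + rng.choices(general_data, k=n_general)
    rng.shuffle(mixed)  # interleave the two sources
    return mixed


# Hypothetical records; the "source" field is only here to make the mix visible.
agent_examples = [{"source": "agent", "text": "Thought: ... Action: ..."}]
general_examples = [{"source": "general", "text": "Instruction: ... Response: ..."}]

if __name__ == "__main__":
    batch = mix_sft_data(agent_examples, general_examples, agent_fraction=0.3, size=10)
    print(sum(ex["source"] == "agent" for ex in batch), "agent examples out of", len(batch))
```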
Source: arxiv.org