The paper introduces AdaRefiner, a framework that aims to improve the decision-making capabilities of Reinforcement Learning (RL) agents by integrating Large Language Models (LLMs) with adaptive feedback. The key component of AdaRefiner is the Adapter Language Model (Adapter LM), which acts as an intermediary between the RL agent and the Decision LLM (e.g., GPT-4).
The Adapter LM takes as input environmental information together with a measure of how well the agent has understood the language guidance provided by the Decision LLM. From these inputs it generates tailored prompts that refine the Decision LLM's understanding of the specific task and environment, so that the Decision LLM can provide more relevant and effective guidance to the RL agent.
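To make this interface concrete, the sketch below shows one way an adapter model could fold an environment summary and a comprehension estimate into a refined prompt. It is a minimal illustration only: the names (`AdapterLM`, `Observation`, `build_prompt`, `query_decision_llm`) and the naive comprehension heuristic are assumptions for this example, not the authors' actual implementation.

```python
# Minimal sketch of the Adapter LM -> Decision LLM prompt-refinement step.
# All names and the comprehension heuristic are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Observation:
    """Textualized snapshot of the environment and the agent's recent behavior."""
    scene_description: str                       # e.g. "agent near a tree, low health"
    recent_achievements: List[str] = field(default_factory=list)


class AdapterLM:
    """Small, fine-tunable LM mediating between the RL agent and the Decision LLM."""

    def comprehension_score(self, obs: Observation, last_guidance: str) -> float:
        # Placeholder estimate of how well the agent followed the previous guidance:
        # the fraction of suggested sub-goals that show up among recent achievements.
        goals = [g.strip() for g in last_guidance.split(",") if g.strip()]
        achieved = " ".join(obs.recent_achievements)
        return sum(g in achieved for g in goals) / max(len(goals), 1)

    def build_prompt(self, obs: Observation, last_guidance: str) -> str:
        # Fold the environment summary and the comprehension estimate into a
        # refined prompt for the Decision LLM.
        score = self.comprehension_score(obs, last_guidance)
        return (
            f"Environment: {obs.scene_description}\n"
            f"Previous guidance: {last_guidance}\n"
            f"Estimated agent comprehension: {score:.2f}\n"
            "Suggest the next sub-goals, phrased so the agent can follow them."
        )


def query_decision_llm(prompt: str) -> str:
    # Stand-in for a call to the Decision LLM (e.g. GPT-4 through an API client).
    return "collect wood, craft wooden pickaxe"
```

The comprehension estimate here is deliberately simplistic; any learned or heuristic signal of how well the agent follows the guidance could take its place.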
The training process of AdaRefiner involves a feedback loop in which the RL agent's actions and trajectories are fed back to update the Adapter LM's comprehension of the environment. This allows the Adapter LM to continuously refine its understanding and generate increasingly appropriate prompts for the Decision LLM, which in turn improves the RL agent's decision-making.
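The feedback loop could be organized roughly as in the sketch below, building on the interfaces assumed above. Here `env` and `agent` are placeholders for a Gym-style environment and a guidance-conditioned policy, and the `update` calls stand in for whatever RL and fine-tuning objectives the paper actually uses.

```python
def train_adarefiner(env, agent, adapter, episodes: int = 1000):
    """Hypothetical outer loop: agent experience feeds back into the Adapter LM."""
    guidance = "explore the surroundings"        # initial guidance before any feedback
    for _ in range(episodes):
        obs, trajectory = env.reset(), []

        # 1. Adapter LM refines the prompt; the Decision LLM returns new guidance.
        guidance = query_decision_llm(adapter.build_prompt(obs, guidance))

        # 2. The RL agent acts in the environment, conditioned on the guidance.
        done = False
        while not done:
            action = agent.act(obs, guidance)
            next_obs, reward, done, info = env.step(action)
            trajectory.append((obs, action, reward, guidance))
            obs = next_obs

        # 3. Standard RL update for the agent, plus an Adapter LM update from the
        #    same trajectory, closing the adaptive feedback loop.
        agent.update(trajectory)
        adapter.update(trajectory)               # assumed fine-tuning step, not defined above
```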
The authors evaluate AdaRefiner on 22 diverse tasks within the Crafter environment, a benchmark for open-world games. The results demonstrate that AdaRefiner outperforms state-of-the-art baselines, including LLM-based methods and RL algorithms, in terms of overall performance, success rates, and the depth of achievements completed by the agents. The authors also conduct ablation studies to highlight the importance of the Adapter LM and the adaptive feedback from the RL agent.
Furthermore, the paper provides insights into the guidance provided by AdaRefiner and the common-sense behaviors exhibited by the agents, showcasing the framework's ability to steer agents towards higher-level and more coherent decision-making.