The paper introduces a collaborative blocks world environment (COBLOCK) to evaluate the collaboration abilities of large language models (LLMs). In COBLOCK, two agents (either human or LLM) with complementary goals and skills work together to build a target structure.
The key highlights and insights are:
The COBLOCK environment has three types of collaboration tasks with increasing levels of interdependence between the agents' goals and skills: independent tasks, skill-dependent tasks, and goal-dependent tasks.
To guide the LLM agents in COBLOCK, the authors propose a chain-of-thought (CoT) prompting approach built around three key reasoning steps, which include modeling the partner's state and reflecting on the agent's own actions.
Experiments show that baseline LLM agents struggle in the skill-dependent and goal-dependent tasks because of issues such as prioritizing their partner's goals over their own. The proposed approach, which adds partner-state modeling and self-reflection, significantly improves collaboration performance: task success rates rise, workload is better balanced, and tasks finish in fewer timesteps.
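The three outcome measures named here (task success rate, workload balance, and completion timesteps) could be computed along the following lines. This is only a minimal sketch: the `Episode` record, its field names, and in particular the workload-balance formula are assumptions made for illustration, not the definitions used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """One collaboration run (hypothetical record layout)."""
    success: bool   # was the target structure completed?
    timesteps: int  # steps taken before the episode ended
    blocks_a: int   # blocks placed by agent A
    blocks_b: int   # blocks placed by agent B


def success_rate(episodes):
    """Fraction of episodes in which the structure was completed."""
    return sum(e.success for e in episodes) / len(episodes)


def workload_balance(episode):
    """Assumed definition: 1.0 means an even split, 0.0 means one agent did everything."""
    total = episode.blocks_a + episode.blocks_b
    if total == 0:
        return 1.0  # nothing placed, so nothing is unbalanced
    return 1.0 - abs(episode.blocks_a - episode.blocks_b) / total


def mean_timesteps(episodes):
    """Average completion time over successful episodes only."""
    done = [e.timesteps for e in episodes if e.success]
    return sum(done) / len(done) if done else float("nan")
```

Averaging timesteps over successful episodes only is a design choice in this sketch; it keeps failed runs that hit a step limit from distorting the completion-time figure.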
While human-machine collaboration has slightly higher success rates than machine-machine collaboration, humans often take on more responsibility when the LLM agent struggles, especially in the more challenging goal-dependent tasks.
The findings and the COBLOCK environment provide valuable insights and resources for future research on communication, coordination, and collaboration in multi-agent settings.
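The prompting approach summarized above, with its partner-state modeling and self-reflection steps, can be illustrated with a small prompt builder. The sketch below is hypothetical throughout: the function name, its parameters, and the prompt wording are invented for illustration and do not reproduce the authors' actual prompts.

```python
def build_cot_prompt(own_goal, world_state, partner_messages, last_action=None):
    """Assemble a chain-of-thought style prompt for one agent (illustrative only)."""
    sections = [
        "You are one of two agents building a target structure together.",
        f"Your goal: {own_goal}",
        f"Current world state: {world_state}",
    ]
    if partner_messages:
        # Show the dialogue so far so the model can reason about the partner.
        history = "\n".join(f"- {m}" for m in partner_messages)
        sections.append("Partner messages so far:\n" + history)
    sections.append(
        "Partner-state modeling: infer what your partner is trying to build "
        "and what they still need from you."
    )
    if last_action is not None:
        # Only ask for reflection once there is a previous action to inspect.
        sections.append(
            f"Self-reflection: your previous action was '{last_action}'. "
            "Check whether it had the intended effect and correct any mistake."
        )
    sections.append("Finally, state your next action.")
    return "\n\n".join(sections)
```

A call such as `build_cot_prompt("place two red blocks", "empty grid", ["I need a yellow block"])` returns a single string that can be sent to whichever model API is in use.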
Key Insights Distilled From
by Guande Wu,Ch... at arxiv.org 04-02-2024
https://arxiv.org/pdf/2404.00246.pdf