Key Concept
A3T is a framework for the autonomous annotation of agent trajectories in the style of ReAct, which enables contrastive self-training of language agents.
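As a rough illustration of the data involved, the sketch below represents a ReAct-style trajectory as a sequence of thought, action, and observation steps, and separates successful from failed trajectories so the two groups can be contrasted during training. It is not the A3T implementation: the names `Step`, `Trajectory`, `to_react_text`, and `contrastive_split`, the `Thought i / Action i / Observation i` text layout, and the boolean `success` flag are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    """One ReAct-style step (hypothetical structure for illustration)."""
    thought: str       # reasoning annotated for the action
    action: str        # action the agent issued to the environment
    observation: str   # what the environment returned


@dataclass
class Trajectory:
    """A full episode plus a binary outcome (assumed success signal)."""
    steps: List[Step]
    success: bool


def to_react_text(traj: Trajectory) -> str:
    """Serialize a trajectory as numbered Thought/Action/Observation lines."""
    lines = []
    for i, step in enumerate(traj.steps, start=1):
        lines.append(f"Thought {i}: {step.thought}")
        lines.append(f"Action {i}: {step.action}")
        lines.append(f"Observation {i}: {step.observation}")
    return "\n".join(lines)


def contrastive_split(
    trajs: List[Trajectory],
) -> Tuple[List[Trajectory], List[Trajectory]]:
    """Partition trajectories into (successful, failed) groups."""
    positives = [t for t in trajs if t.success]
    negatives = [t for t in trajs if not t.success]
    return positives, negatives
```

In a self-training loop of this kind, the successful group would serve as positive examples and the failed group as negative ones. How the two groups are actually weighted in the training objective is not stated in these notes, so the sketch leaves it out.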
Statistics
"In AlfWorld, the agent trained with A3T obtains a 1-shot success rate of 96%, and 100% success with 4 iterative rounds."
"On WebShop, the 1-shot performance of the A3T agent matches human average, and 4 rounds of iterative refinement lead to the performance approaching human experts."
Quotes
"We propose A3T, a framework that enables the Autonomous Annotation of Agent Trajectories in the style of ReAct."
"A3T paves the way for agents with improved autonomy through the closed loop of self-annotation and contrastive self-training."