HELPER-X: A Unified Instructable Embodied Agent for Interactive Vision-Language Tasks
HELPER-X is a unified instructable embodied agent that achieves state-of-the-art performance across four interactive vision-language domains (dialogue-based task completion, natural language instruction following, active question asking, and room tidying) by expanding the memory-augmented prompting capabilities of the HELPER agent.
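As a rough illustration of what memory-augmented prompting can look like, the sketch below retrieves the stored examples most similar to a new instruction and places them in a prompt ahead of the query. It is not the HELPER-X implementation: the memory contents, the action names in the example programs, the bag-of-words similarity measure, and the functions `retrieve` and `build_prompt` are all hypothetical choices made for this example.

```python
import math
from collections import Counter

# Hypothetical memory of (instruction, program) pairs; the action names
# are invented for illustration and are not the agent's real API.
MEMORY = [
    ("put the mug in the sink", "pickup('mug'); place('sink')"),
    ("tidy the living room", "find_misplaced(); relocate_all()"),
    ("slice the tomato", "pickup('knife'); slice('tomato')"),
]


def bag_of_words(text):
    """Lowercased whitespace-token counts (a stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=2):
    """Return the k memory entries most similar to the query instruction."""
    q = bag_of_words(query)
    ranked = sorted(MEMORY, key=lambda ex: cosine(q, bag_of_words(ex[0])), reverse=True)
    return ranked[:k]


def build_prompt(query, k=2):
    """Assemble a prompt: retrieved examples first, then the new instruction."""
    blocks = [f"Instruction: {inst}\nProgram: {prog}" for inst, prog in retrieve(query, k)]
    blocks.append(f"Instruction: {query}\nProgram:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(build_prompt("put the cup in the sink", k=1))
```

In this sketch, supporting a new domain only requires adding examples to the memory; the retrieval and prompt-assembly code stays the same.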