Core Concept
Agents need to exhibit intuitive physical reasoning, multi-step planning, and in-situ intervention to succeed in the I-PHYRE framework.
Summary
The I-PHYRE framework challenges agents to demonstrate intuitive physical reasoning, multi-step planning, and in-situ intervention, and it addresses a gap in evaluating how well agents interact with dynamic events. Existing work explores physical reasoning only in limited ways because of constraints such as passive observation or single-round intervention. I-PHYRE comprises 40 distinct games organized into four splits (basic, noisy, compositional, and multi-ball), which are used for training and for scrutinizing how well agents learn and generalize the essential principles of interactive physical reasoning.
Introduction:
- Current evaluation protocols focus on stationary scenes.
- I-PHYRE introduces interactive physical reasoning.
- Challenges agents with intuitive physical reasoning, multi-step planning, and in-situ intervention.
Game Design:
- Consists of 40 distinctive interactive physics games.
- Categorized into basic, noisy, compositional, and multi-ball splits.
- The unified objective is to guide red balls into the hole by strategically eliminating gray blocks.
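
One way to picture a multi-step, in-situ intervention is as a list of timed block eliminations that are applied while the scene is still evolving. The sketch below is a minimal illustration under that assumption only: `Elimination`, `apply_plan`, the `SPLITS` tuple, and the use of seconds as the time unit are hypothetical names and choices, not the I-PHYRE API, and no physics is simulated.

```python
from dataclasses import dataclass

# Hypothetical names throughout; none of this is the I-PHYRE API.
# The four split names are taken from the summary above.
SPLITS = ("basic", "noisy", "compositional", "multi-ball")


@dataclass(frozen=True)
class Elimination:
    time: float    # when to act, in seconds after the episode starts (assumed unit)
    block_id: int  # index of the gray block to eliminate


def apply_plan(gray_blocks, plan, horizon):
    """Return the gray blocks still present after `horizon` seconds."""
    remaining = set(gray_blocks)
    # Actions take effect in time order, regardless of how the plan is listed.
    for step in sorted(plan, key=lambda s: s.time):
        if step.time > horizon:
            break
        # Eliminating an unknown or already removed block is a no-op.
        remaining.discard(step.block_id)
    return remaining


plan = [Elimination(1.5, 2), Elimination(0.4, 0)]
remaining_mid = apply_plan({0, 1, 2}, plan, horizon=1.0)
remaining_end = apply_plan({0, 1, 2}, plan, horizon=5.0)
```

Here only the elimination scheduled at 0.4 has fired by `horizon=1.0`, while both have fired by `horizon=5.0`. Making the timestamp part of each action is what separates this kind of plan from a single-round intervention chosen before the scene starts moving.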
Experiments:
- Assess capabilities of learning agents on different splits.
- Humans outperform RL agents on I-PHYRE tasks.
- Planning strategies impact agent performance significantly.
Statistics
https://www.youtube.com/watch?v=O4_848IPFVw
Quotes
Players must meticulously control the launcher... - "Consider the dynamics of playing a game of 3D pinball..."
Contemporary benchmarks have emerged... - "The profound physical reasoning aptitude observed in humans..."