Core Concepts
Agents need interactive physical reasoning capabilities for real-time interventions in dynamic environments.
Abstract
The paper introduces I-PHYRE, a framework that challenges agents with interactive physical reasoning. It emphasizes intuitive physical reasoning, multi-step planning, and in-situ intervention. The games are split into basic, noisy, compositional, and multi-ball categories to test generalization. Human baseline performance is compared with that of various RL agents using different planning strategies.
Introduction
Current evaluation protocols do not assess agents' abilities to interact with dynamic events.
I-PHYRE challenges agents with intuitive physical reasoning and multi-step planning.
Game Design
The benchmark comprises 40 distinctive interactive physics games, categorized into four splits (basic, noisy, compositional, and multi-ball) based on algorithmic stability.
Planning Strategies
Three strategies are explored for interactive physical reasoning problems: planning in advance, on-the-fly planning, and a combination of the two.
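The difference between the three strategies can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `ToyWorld` is a hypothetical one-dimensional stand-in (a carrier moves right at a fixed speed and the agent must release a ball exactly over a hole), not the I-PHYRE simulator, and the `run_*` helpers are not the agents evaluated in the paper. Planning in advance commits to a release step using an assumed speed, on-the-fly planning decides from the current observation at every step, and the combined strategy starts from the advance plan and revises it when the observed motion disagrees with the assumed model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyWorld:
    """Hypothetical 1-D world; an illustrative stand-in, not an I-PHYRE game."""
    speed: int                          # cells the carrier moves per step (> 0)
    hole_x: int                         # cell the ball must be released over
    carrier_x: int = 0
    released_at: Optional[int] = None   # cell where the ball was released

    def step(self, release: bool) -> None:
        # Apply the intervention first, then advance the dynamics.
        if release and self.released_at is None:
            self.released_at = self.carrier_x
        self.carrier_x += self.speed

    @property
    def solved(self) -> bool:
        return self.released_at == self.hole_x


def plan_release_step(assumed_speed: int, distance: int) -> int:
    """Number of steps until the carrier covers `distance` under the assumed model."""
    return distance // assumed_speed


def run_in_advance(world: ToyWorld, assumed_speed: int, horizon: int = 20) -> bool:
    # The whole plan is fixed before the episode starts and never revised.
    release_step = plan_release_step(assumed_speed, world.hole_x)
    for t in range(horizon):
        world.step(release=(t == release_step))
    return world.solved


def run_on_the_fly(world: ToyWorld, horizon: int = 20) -> bool:
    # No plan: decide at every step from the current observation only.
    for _ in range(horizon):
        world.step(release=(world.carrier_x >= world.hole_x))
    return world.solved


def run_combined(world: ToyWorld, assumed_speed: int, horizon: int = 20) -> bool:
    # Start from the advance plan, then replan when observations contradict it.
    release_step = plan_release_step(assumed_speed, world.hole_x)
    prev_x = world.carrier_x
    for t in range(horizon):
        world.step(release=(t == release_step))
        observed_speed = world.carrier_x - prev_x
        prev_x = world.carrier_x
        if observed_speed != assumed_speed and world.released_at is None:
            assumed_speed = observed_speed
            remaining = world.hole_x - world.carrier_x
            release_step = t + 1 + plan_release_step(assumed_speed, remaining)
    return world.solved


# The planner assumes speed 1, but the carrier actually moves at speed 2.
print(run_in_advance(ToyWorld(speed=2, hole_x=6), assumed_speed=1))
print(run_on_the_fly(ToyWorld(speed=2, hole_x=6)))
print(run_combined(ToyWorld(speed=2, hole_x=6), assumed_speed=1))
```

In this toy setting the advance plan only succeeds when its assumed model matches the true dynamics, while the other two strategies recover by using observations during the episode. This says nothing about how the strategies rank in the actual benchmark.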
Experiments
The RL agents' zero-shot generalization performance is analyzed across the different splits.
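A per-split analysis of this kind can be sketched as a small Python helper that turns episode outcomes into one success rate per split. The function name `success_rate_by_split` and the `(split, solved)` record format are assumptions for illustration, and the outcomes in the usage lines are placeholders, not results from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Split names as listed in the summary above.
SPLITS = ("basic", "noisy", "compositional", "multi-ball")


def success_rate_by_split(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Aggregate (split, solved) records into a success rate per split.

    Splits with no recorded episodes are omitted from the output.
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for split, solved in results:
        if split not in SPLITS:
            raise ValueError(f"unknown split: {split!r}")
        totals[split] += 1
        wins[split] += int(solved)
    return {s: wins[s] / totals[s] for s in SPLITS if totals[s]}


# Placeholder outcomes, purely to show the call shape.
placeholder = [("basic", True), ("basic", False), ("noisy", False), ("noisy", False)]
print(success_rate_by_split(placeholder))
```

Reporting the rate separately for each split, rather than one pooled number, is what makes a generalization gap on unseen splits visible.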
Discussion
The performance disparity between RL agents and humans is highlighted.
Conclusion
I-PHYRE aims to assess how effectively learning methods can interact with the physical world.
Stats
The outcomes highlight a notable gap between existing learning algorithms and human performance.
Human participants achieve a success rate above 80%, demonstrating robust problem-solving abilities.
Quotes
"Prevailing studies exhibit notable limitations in exploring physical reasoning due to constraints."
"Current RL agents manifest substantial gaps in generalization compared to humans."