The authors introduce a novel dynamic, interactive RL testbed based on the game of air hockey. The testbed offers several properties that facilitate RL training, such as constrained puck movement, a strictly bounded agent workspace, and support for multiple objects. It includes two simulators of increasing fidelity to the real world, as well as a real-world setup with a UR5 robot arm.
The testbed provides a collection of ten tasks of varying difficulty, ranging from simple reaching to more challenging tasks such as juggling the puck or hitting it into a goal region with a desired velocity. The authors evaluate three representative baselines, behavior cloning, online (vanilla) RL, and offline RL, on these tasks in both simulation and the real world.
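To make the hardest task concrete, a reward for "hit the puck into a goal region with a desired velocity" might look like the sketch below. This is an illustrative assumption, not the paper's actual API: the function name `goal_hit_reward`, the sparse success bonus, and the distance-based shaping term are all hypothetical choices.

```python
import math

def goal_hit_reward(puck_pos, puck_vel, goal_center,
                    goal_radius, target_speed, speed_tol=0.2):
    """Hypothetical reward for the goal-hitting task (not the paper's code).

    Success: puck is inside the goal region AND its speed is within
    `speed_tol` of the desired speed. Otherwise, return a shaping
    penalty proportional to the puck's distance from the goal center.
    """
    dist = math.dist(puck_pos, goal_center)      # distance to goal center
    speed = math.hypot(*puck_vel)                # puck speed (vector norm)
    if dist <= goal_radius and abs(speed - target_speed) <= speed_tol:
        return 1.0                               # sparse success bonus
    return -dist                                 # dense shaping term
```

In practice the paper's tasks may use different success criteria or shaping; the point is that a velocity-conditioned goal adds a second success condition on top of the usual positional one, which is what makes the task harder than plain goal reaching.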
The results show that online RL performs best among the baselines in simulation, while in the real world all baselines fall short of human performance, leaving substantial room for improvement. The authors discuss future work, including goal-conditioned RL, sim-to-real transfer, and unsupervised skill learning in the air hockey testbed.