This work introduces Teacher-Augmented Policy Gradient (TAPG), a novel two-stage learning framework for interactive grasping in cluttered environments. The framework combines reinforcement learning with policy distillation. In the first stage, a teacher policy is trained to master motor control based on object pose information. In the second stage, the teacher guides the learning of a sensorimotor policy based on object segmentation, so that learning is guided yet remains adaptive. The method integrates pre-trained segmentation models with reinforcement learning to produce executable grasping policies, and the study reports efficient transfer from simulation to real-world scenarios, including robust zero-shot transfer to novel objects and cluttered environments.
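As a rough illustration of how a policy-gradient term and a distillation term can be combined in one objective, the sketch below computes a per-sample loss for a student policy over a small discrete action set. It is a minimal sketch under stated assumptions, not the objective used in the paper: the discrete action space, the KL-based distillation term, the weight `beta`, and the function names are all hypothetical choices made for this example.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; q must be strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def teacher_augmented_loss(student_logits, teacher_logits, action, advantage, beta=0.5):
    """Hypothetical combined loss for one sample.

    pg_term:      REINFORCE-style surrogate, -advantage * log pi_student(action)
    distill_term: KL(teacher || student), pulling the student toward the teacher
    beta:         assumed trade-off weight between the two terms
    """
    student = softmax(student_logits)
    teacher = softmax(teacher_logits)
    pg_term = -advantage * math.log(student[action])
    distill_term = kl_divergence(teacher, student)
    return pg_term + beta * distill_term


if __name__ == "__main__":
    # The student is undecided between two actions; the teacher prefers action 0.
    loss = teacher_augmented_loss([0.0, 0.0], [2.0, 0.0], action=0, advantage=1.0)
    print(f"combined loss: {loss:.4f}")
```

When the student already matches the teacher, the distillation term vanishes and only the policy-gradient term remains. A larger `beta` makes the student follow the teacher more closely, while a smaller one leaves more room for the student to adapt from its own reward signal.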
Key Insights Distilled From
by Malte Mosbac... at arxiv.org 03-18-2024
https://arxiv.org/pdf/2403.10187.pdf