Core Concepts
Universal Manipulation Interface (UMI) enables direct skill transfer from in-the-wild human demonstrations to deployable robot policies, unlocking new robot manipulation capabilities.
Abstract
Universal Manipulation Interface (UMI) is a portable, intuitive, low-cost data collection and policy learning framework that transfers diverse human demonstrations to effective visuomotor policies. UMI addresses four shortcomings of prior work: insufficient visual context, action imprecision, latency discrepancies, and insufficient policy representation. By carefully designing the demonstration and policy interfaces, UMI provides a practical and accessible framework for teaching robots complex manipulation skills. The system captures rapid movements at absolute scale using IMU-aware tracking and supports continuous gripper control. UMI also applies kinematics-based data filtering so that the retained trajectories are valid for different robot embodiments. In experiments on cup arrangement, dynamic tossing, bimanual cloth folding, and dish washing, UMI handles a range of manipulation challenges with high success rates. Because data can be collected in the wild, UMI policies also generalize well to novel environments and objects. The system offers higher data collection throughput than traditional teleoperation while maintaining high accuracy in SLAM-based tracking.
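The idea behind kinematics-based data filtering can be illustrated with a minimal sketch. This is not UMI's implementation: the function names, the spherical reach model, and the limits `max_reach` and `max_speed` are all assumptions chosen for illustration. The sketch keeps a demonstration only if every recorded gripper position lies within a hypothetical robot's reach and no segment requires moving faster than a hypothetical speed limit.

```python
import math


def is_trajectory_feasible(positions, timestamps, base_position, max_reach, max_speed):
    """Return True if a hypothetical robot could follow the trajectory.

    Assumptions (illustrative only): the workspace is a sphere of radius
    max_reach (meters) around base_position, and the end effector cannot
    exceed max_speed (meters per second).
    """
    if len(positions) != len(timestamps):
        raise ValueError("positions and timestamps must have the same length")

    # Reach check: every waypoint must lie inside the spherical workspace.
    for point in positions:
        if math.dist(point, base_position) > max_reach:
            return False

    # Speed check: each consecutive pair of waypoints must be traversable
    # within the speed limit, and timestamps must be strictly increasing.
    for i in range(1, len(positions)):
        dt = timestamps[i] - timestamps[i - 1]
        if dt <= 0:
            return False
        if math.dist(positions[i - 1], positions[i]) / dt > max_speed:
            return False

    return True


def filter_demonstrations(demos, base_position, max_reach, max_speed):
    """Keep only the demonstrations that pass the feasibility check."""
    return [
        demo
        for demo in demos
        if is_trajectory_feasible(
            demo["positions"], demo["timestamps"], base_position, max_reach, max_speed
        )
    ]


# Three hypothetical demonstrations: one feasible, one out of reach, one too fast.
slow = {"positions": [(0.30, 0.0, 0.2), (0.35, 0.0, 0.2)], "timestamps": [0.0, 0.1]}
far = {"positions": [(1.20, 0.0, 0.2)], "timestamps": [0.0]}
fast = {"positions": [(0.30, 0.0, 0.2), (0.60, 0.0, 0.2)], "timestamps": [0.0, 0.1]}

kept = filter_demonstrations(
    [slow, far, fast], base_position=(0.0, 0.0, 0.0), max_reach=0.85, max_speed=1.0
)
```

Running the same demonstrations through the filter with a different set of limits yields a different subset, which is how one dataset could in principle serve several robot embodiments. A real system would likely check joint limits and velocities through the robot's kinematic model rather than a spherical workspace.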
Stats
UMI achieves an 87.5% success rate on the dynamic tossing task.
UMI achieves a 70% success rate on the bimanual cloth folding task.
UMI achieves a 70% success rate on the dish washing task.
Quotes
"UMI eliminates the need for physical robots during data collection and offers a more portable interface for in-the-wild robot teaching."
"We demonstrate UMI’s versatility and efficacy with comprehensive real-world experiments."
"UMI's hardware and software system is open-sourced at https://umi-gripper.github.io."