The paper proposes the task of Hand-Object Stable Grasp Reconstruction (HO-SGR), which focuses on reconstructing hands and objects during temporal segments of stable grasps. The authors first define a stable grasp based on the intuition that the contact area between the hand and the object should remain stable over time. By analyzing the 3D ARCTIC dataset, they identify stable grasp durations and show that objects in stable grasps move within a single degree of freedom (1-DoF) relative to the hand pose.
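The intuition that the contact area should remain stable can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the authors' definition: it assumes contact is detected on object vertices by a distance threshold to the hand, and it scores stability as the intersection-over-union of the per-frame contact sets. The function names `contact_mask` and `contact_stability` and the 5 mm threshold are hypothetical.

```python
import numpy as np

def contact_mask(object_vertices, hand_vertices, threshold=0.005):
    """Mark object vertices lying within `threshold` (assumed metres) of any hand vertex.

    object_vertices: (N, 3) array, hand_vertices: (M, 3) array.
    Returns a boolean array of length N.
    """
    object_vertices = np.asarray(object_vertices, dtype=float)
    hand_vertices = np.asarray(hand_vertices, dtype=float)
    # Pairwise distances between every object vertex and every hand vertex.
    diffs = object_vertices[:, None, :] - hand_vertices[None, :, :]
    distances = np.linalg.norm(diffs, axis=-1)
    return distances.min(axis=1) < threshold

def contact_stability(masks):
    """Intersection-over-union of contact sets across frames.

    masks: (T, N) boolean array, one contact mask per frame.
    A value near 1 means the same vertices stay in contact throughout.
    """
    masks = np.asarray(masks, dtype=bool)
    intersection = np.logical_and.reduce(masks, axis=0).sum()
    union = np.logical_or.reduce(masks, axis=0).sum()
    return float(intersection / union) if union > 0 else 0.0
```

Under this assumed measure, a segment could be treated as a stable grasp while the score stays above a chosen cutoff. Both the cutoff and the choice to measure contact on the object rather than the hand are design decisions the summary does not specify.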
The authors then propose a method that jointly optimizes all frames within a stable grasp, constraining the object's motion relative to the hand to rotation about a latent 1-DoF axis. This contrasts with previous methods, which optimize each frame independently or assume free 6-DoF object motion.
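One way to picture the 1-DoF constraint is as a parameterization in which every frame shares a base rotation and a rotation axis, and only a scalar angle varies per frame. The sketch below illustrates that idea for rotations only. It is an assumption-laden illustration rather than the paper's implementation, and the names `rotation_about_axis` and `one_dof_rotations` are hypothetical.

```python
import numpy as np

def rotation_about_axis(axis, angle):
    """Rodrigues' formula: 3x3 rotation by `angle` radians about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    x, y, z = axis
    # Skew-symmetric cross-product matrix of the unit axis.
    K = np.array([[0.0, -z, y],
                  [z, 0.0, -x],
                  [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def one_dof_rotations(base_rotation, axis, angles):
    """Per-frame object rotations in the hand frame under a 1-DoF assumption.

    All frames share `base_rotation` (3x3) and `axis` (3-vector);
    only the scalar entry of `angles` differs from frame to frame.
    Returns an array of shape (T, 3, 3).
    """
    base_rotation = np.asarray(base_rotation, dtype=float)
    return np.stack([rotation_about_axis(axis, a) @ base_rotation for a in angles])
```

With this parameterization, the relative rotation between any two frames is itself a rotation about the shared axis, so the axis direction is unchanged by it. A joint optimizer would fit the shared quantities and the per-frame angles together, whereas a per-frame or free 6-DoF approach would fit an unconstrained pose for each frame.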
The authors also introduce the EPIC-Grasps dataset, which contains 2,431 video clips of stable grasps from 141 distinct videos in 31 kitchens, with 2D segmentation masks for the hand and object. According to the authors, it is the first dataset to capture in-the-wild egocentric videos of functional hand-object interactions.
The authors evaluate their proposed 1-DoF optimization method on both the ARCTIC-Grasps and EPIC-Grasps datasets. On ARCTIC-Grasps, the 1-DoF method outperforms the baselines in both 3D reconstruction accuracy and stable contact area. On EPIC-Grasps, the 1-DoF method achieves the best stable contact area metrics among the compared methods, indicating the importance of the constrained object motion assumption in the in-the-wild setting.
Key Insights Distilled From
by Zhifan Zhu,D... at arxiv.org 04-09-2024
https://arxiv.org/pdf/2312.15719.pdf