Core Concepts
Gaze plays a crucial role in predicting human motion, which motivates a new dataset and a new method for synthesizing gaze-guided hand-object interactions.
Abstract
The paper introduces the GazeHOI dataset for gaze-guided hand-object interactions and presents GHO-Diffusion, a hierarchical framework for synthesizing them. The method combines spatial-temporal feature encoding, goal pose generation, and diffusion models for object and hand motions. Extensive experiments validate the effectiveness of both the dataset and the approach.
Introduction
Gaze is a significant cue for revealing human intent.
The relationship between gaze, attention, and activities is explored.
Dataset Creation
GazeHOI is introduced as a dataset that models gaze, hand, and object interactions in 3D.
It features 479 sequences covering various tasks that involve 33 objects.
Methodology
GHO-Diffusion, a hierarchical framework, is introduced for synthesis.
The pre-diffusion phase separates gaze conditions into spatial-temporal features and goal poses.
The diffusion phase generates object motions based on the gaze conditions.
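The two phases above can be sketched in a few lines of NumPy. This is only an illustrative toy, not the GHO-Diffusion implementation: the function names (`encode_gaze`, `predict_goal_pose`, `toy_denoiser`, `sample_object_motion`), the moving-average encoding, the "mean of the last gaze points" goal, and the linear noise schedule are all assumptions made for the example. The learned denoising network is replaced by a stand-in that always points toward a straight line from the first gaze point to the goal, and the sampling loop is a standard DDPM-style reverse process.

```python
import numpy as np

def encode_gaze(gaze, window=5):
    """Pre-diffusion (assumed toy version): raw gaze plus a moving average."""
    kernel = np.ones(window) / window
    smoothed = np.stack(
        [np.convolve(gaze[:, d], kernel, mode="same") for d in range(gaze.shape[1])],
        axis=1,
    )
    return np.concatenate([gaze, smoothed], axis=1)  # shape (T, 6)

def predict_goal_pose(gaze, tail=10):
    """Pre-diffusion (assumed toy version): goal position = mean of final gaze points."""
    return gaze[-tail:].mean(axis=0)  # shape (3,)

def toy_denoiser(x_t, abar_t, features, goal):
    """Stand-in for a learned noise predictor (hypothetical, not a trained network)."""
    w = np.linspace(0.0, 1.0, x_t.shape[0])[:, None]
    x0_guess = (1.0 - w) * features[0, :3] + w * goal  # straight line to the goal
    # Invert x_t = sqrt(abar) * x0 + sqrt(1 - abar) * eps for eps.
    return (x_t - np.sqrt(abar_t) * x0_guess) / np.sqrt(1.0 - abar_t)

def sample_object_motion(features, goal, steps=50, seed=0):
    """Diffusion phase: DDPM-style reverse sampling of an object trajectory."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, steps)  # assumed linear schedule
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)
    x = rng.standard_normal((features.shape[0], 3))  # start from pure noise
    for t in range(steps - 1, -1, -1):
        eps_hat = toy_denoiser(x, abar[t], features, goal)
        mean = (x - betas[t] / np.sqrt(1.0 - abar[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0  # no noise on last step
        x = mean + np.sqrt(betas[t]) * noise
    return x

# Synthetic gaze track drifting toward a target (made-up data for the demo).
T = 60
demo_rng = np.random.default_rng(1)
gaze = np.linspace([0.0, 0.0, 0.0], [1.0, 0.5, 0.2], T) + 0.01 * demo_rng.standard_normal((T, 3))

features = encode_gaze(gaze)
goal = predict_goal_pose(gaze)
motion = sample_object_motion(features, goal)
```

Because the stand-in denoiser always targets the same straight line, the sampled trajectory ends on that line, starting at the first gaze point and finishing at the goal. A trained network would instead produce varied, data-driven motions conditioned on the gaze features.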
Experiments
The data is split into training and test sets for evaluation.
The proposed method is compared against baselines and shows superior results.
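As a small illustration of the train/test split mentioned above, the sketch below partitions sequence indices at the sequence level so that no sequence appears in both sets. The count of 479 comes from the dataset description; the 20% test fraction, the seed, and the helper name `split_sequences` are assumptions for the example, since the summary does not state the actual split.

```python
import random

NUM_SEQUENCES = 479   # from the dataset description
TEST_FRACTION = 0.2   # assumed; the actual split ratio is not given here

def split_sequences(num_sequences, test_fraction, seed=0):
    """Shuffle sequence ids reproducibly and cut them into train and test lists."""
    ids = list(range(num_sequences))
    random.Random(seed).shuffle(ids)
    n_test = round(num_sequences * test_fraction)
    return sorted(ids[n_test:]), sorted(ids[:n_test])

train_ids, test_ids = split_sequences(NUM_SEQUENCES, TEST_FRACTION)
```

Splitting by whole sequence rather than by frame avoids leaking frames of the same interaction into both sets.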
Ablation Study
The impact of different gaze encoding methods and guidance strategies is evaluated.
Stats
Introduces a new dataset and methodology for guided hand-object interaction synthesis.