The paper presents CONTHO, a novel method for jointly reconstructing a 3D human and object that makes effective use of human-object contact information.
The key highlights are:
3D-guided contact estimation: CONTHO first reconstructs initial 3D human and object meshes and uses them as explicit 3D guidance to estimate accurate human-object contact maps.
Contact-based refinement: CONTHO proposes a novel contact-based refinement Transformer (CRFormer) that selectively aggregates human and object features based on the estimated contact maps. This prevents learning of undesired human-object correlation and enables accurate 3D reconstruction.
State-of-the-art performance: CONTHO outperforms previous methods in both human-object contact estimation and joint 3D reconstruction of human and object.
The authors first obtain initial 3D human and object meshes using a backbone network. Then, they extract 3D vertex features from the initial meshes and feed them into the ContactFormer to estimate accurate human-object contact maps.
Finally, CRFormer refines the initial 3D human and object meshes by selectively aggregating human and object features according to the estimated contact maps. Restricting the aggregation to contacting regions keeps the network from learning undesired human-object correlations, which leads to more accurate 3D reconstruction.
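The idea of selective aggregation can be illustrated with a small sketch. This is not the authors' implementation: it is one plausible reading in which a cross-attention step is masked so that human vertex features only attend to object vertices predicted to be in contact. The function name `contact_masked_cross_attention`, the 0.5 threshold, the feature sizes, and the residual update are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; masked (-inf) entries end up with weight 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def contact_masked_cross_attention(query_feats, key_feats, key_contact, threshold=0.5):
    """Hypothetical helper: each query vertex aggregates features only from
    key vertices whose estimated contact probability exceeds `threshold`."""
    d = query_feats.shape[-1]
    scores = query_feats @ key_feats.T / np.sqrt(d)      # (Nq, Nk) similarities
    in_contact = key_contact > threshold                 # (Nk,) boolean mask
    if not in_contact.any():
        # No contact predicted: nothing to aggregate from the other mesh.
        return np.zeros_like(query_feats)
    scores = np.where(in_contact[None, :], scores, -np.inf)
    weights = softmax(scores, axis=-1)                   # rows sum to 1 over contact vertices
    return weights @ key_feats                           # (Nq, d) aggregated features

# Toy data (assumed shapes): 6 human vertices, 5 object vertices, 4-dim features.
rng = np.random.default_rng(0)
human_feats = rng.normal(size=(6, 4))
object_feats = rng.normal(size=(5, 4))
object_contact = np.array([0.9, 0.1, 0.8, 0.2, 0.05])   # stand-in for an estimated contact map

# Residual update of the human features using only contacting object vertices.
refined = human_feats + contact_masked_cross_attention(
    human_feats, object_feats, object_contact
)
```

In this sketch, object vertices 1, 3, and 4 fall below the threshold, so changing their features has no effect on `refined`. That is the property the masking is meant to provide: regions that are not in contact cannot contribute to the refinement.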
Extensive experiments on the BEHAVE and InterCap datasets show that CONTHO outperforms previous methods in both human-object contact estimation and joint 3D reconstruction of human and object.