Core Concepts
Introducing D2A-HMR, a transformer-based architecture that achieves precise human mesh recovery by incorporating scene-depth information and distribution modeling.
Abstract
The paper introduces Distribution and Depth-Aware Human Mesh Recovery (D2A-HMR), an end-to-end transformer architecture designed to address depth ambiguity and distribution disparities in monocular human mesh recovery. Existing methods struggle with challenges such as the appearance domain gap and depth ambiguity, especially when applied to in-the-wild data. The D2A-HMR framework integrates scene-depth information estimated from monocular images to refine the model's representation. By leveraging normalizing flows, the model minimizes the distribution disparity between predicted and ground-truth meshes. The architecture also includes a silhouette decoder, a masked modeling module, and a refinement module to enhance the model's capabilities. Extensive experiments demonstrate the competitive performance of D2A-HMR against state-of-the-art techniques on benchmark datasets such as 3DPW and Human3.6M.
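To make the distribution-alignment idea concrete, here is a minimal sketch (not the authors' implementation) of how a normalizing flow can score residuals between predicted and ground-truth mesh parameters: a single RealNVP-style affine coupling layer maps residuals to a Gaussian base density, and minimizing the negative log-likelihood pulls the predicted distribution toward the ground truth. All function names, shapes, and the tiny linear conditioner are illustrative assumptions.

```python
import numpy as np

def affine_coupling_forward(x, w, b):
    """One affine coupling layer: transform x -> z, return z and log|det J|.

    The first half of each vector conditions the scale/shift applied to the
    second half (hypothetical tiny conditioner: a single linear map each).
    """
    d = x.shape[1] // 2
    x1, x2 = x[:, :d], x[:, d:]
    log_s = np.tanh(x1 @ w)            # scale, bounded for numerical stability
    t = x1 @ w + b                     # shift
    z2 = x2 * np.exp(log_s) + t
    z = np.concatenate([x1, z2], axis=1)
    log_det = log_s.sum(axis=1)        # Jacobian of the coupling is triangular
    return z, log_det

def nll_loss(x, w, b):
    """Negative log-likelihood of x under the flow with a standard Gaussian base."""
    z, log_det = affine_coupling_forward(x, w, b)
    log_base = -0.5 * (z ** 2).sum(axis=1) - 0.5 * z.shape[1] * np.log(2 * np.pi)
    return -(log_base + log_det).mean()

rng = np.random.default_rng(0)
residuals = rng.normal(size=(8, 6))    # predicted-minus-GT mesh parameters (toy)
w = rng.normal(scale=0.1, size=(3, 3)) # conditioner weights (untrained)
b = np.zeros(3)
loss = nll_loss(residuals, w, b)
print(f"flow NLL: {loss:.3f}")
```

In training, this scalar would be one term of the overall objective alongside the standard mesh reconstruction losses; the coupling parameters are learned jointly so that ground-truth-like residuals become high-likelihood under the flow.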
Stats
ArXiv:2403.09063v1 [cs.CV] 14 Mar 2024
Quotes
"Our approach demonstrates superior performance in handling OOD data in certain scenarios while consistently achieving competitive results against state-of-the-art HMR methods on controlled datasets."
"To address the limitations of existing methods, our work introduces a novel approach to address these issues through a depth- and distribution-aware framework designed for the recovery of human mesh from monocular images."