Core Idea
This paper introduces DPMesh, a framework that exploits the knowledge of object structure and spatial interaction held in a pre-trained diffusion model to recover occluded human meshes accurately in a single step.
Summary
The paper presents DPMesh, a framework for occluded human mesh recovery that leverages a pre-trained diffusion model's knowledge of object structure and spatial relationships.
Key highlights:
- Conventional methods rely on convolutional or transformer-based backbones, which struggle to extract effective features under severe occlusion.
- DPMesh uses the pre-trained denoising U-Net from a text-to-image diffusion model as its backbone, carrying that model's learned knowledge over to the mesh recovery task.
- The framework guides the denoising U-Net through condition injection, which derives effective controls from 2D observations.
- A dedicated noisy key-point reasoning approach mitigates disturbances arising from occlusion and crowded scenes.
- Extensive experiments on various occlusion benchmarks show that DPMesh outperforms state-of-the-art methods.
- DPMesh achieves MPJPE (mean per-joint position error, in mm) values of 70.9, 82.2, 79.9, and 73.6 on 3DPW-OC, 3DPW-PC, 3DPW-Crowd, and the 3DPW test split, respectively.
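For readers unfamiliar with the metric behind these numbers, MPJPE is the average Euclidean distance between predicted and ground-truth 3D joints, so lower is better. The snippet below is a minimal sketch of that definition only, not the paper's evaluation code: the function name `mpjpe` and the toy joint coordinates are illustrative assumptions, and benchmark protocols typically also align the root joint before measuring, which is omitted here.

```python
import math

def mpjpe(pred, gt):
    """Mean per-joint position error between two 3D poses.

    pred, gt: sequences of (x, y, z) joint positions given in the same
    unit (e.g. millimetres) and the same joint order.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("poses must be non-empty and have the same number of joints")
    # Euclidean distance for each joint, averaged over all joints.
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)

# Toy two-joint pose: the first joint is exact, the second is off by 5 units.
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(mpjpe(pred, gt))  # 2.5
```

On a dataset, this per-pose value would be averaged over all evaluated poses to obtain a single score such as those listed above.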
Statistics
"We achieve MPJPE values of 70.9, 82.2, 79.9, and 73.6 on 3DPW-OC, 3DPW-PC, 3DPW-Crowd, and 3DPW test split, respectively."
"Remarkably, without any finetuning on the 3DPW training set, our DPMesh achieves an exciting performance, surpassing previous state-of-the-art methods and demonstrating significantly improved accuracy."
Quotes
"To overcome the aforementioned challenges, we present DPMesh, a simple yet effective framework for occluded human mesh recovery."
"Our primary goal is to harness both the high-level and low-level visual concepts within a pre-trained diffusion model for the demanding occluded pose estimation task."
"Extensive experiments on various occlusion benchmarks affirm the efficacy of our framework, as we outperform state-of-the-art methods on both occlusion-specific and standard datasets."