Core Concept
MagicPose is a diffusion-based model that can generate realistic human images with controlled poses and facial expressions while preserving the identity of the reference person.
Summary
The paper proposes MagicPose, a novel approach for realistic human pose and facial expression retargeting. The key idea is to decompose the problem into two tasks: (1) identity/appearance control and (2) pose/motion control.
For appearance control, MagicPose introduces an Appearance Control Model that provides appearance guidance from a reference image to the Stable Diffusion (SD) model via a Multi-Source Attention Module. For pose control, MagicPose uses a Pose ControlNet to provide pose and expression guidance.
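As a rough illustration of how an attention layer could draw on two sources at once, the sketch below lets the queries of the image being generated attend over its own tokens plus tokens from the reference image. This summary does not say how the Multi-Source Attention Module fuses its inputs, so concatenating keys and values along the token axis is an assumption, and the names `attention` and `multi_source_attention` are hypothetical, not taken from the paper or from any library.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of token vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def multi_source_attention(q, k_self, v_self, k_ref, v_ref):
    """Hypothetical fusion: attend over self tokens and reference tokens together.

    Assumption: the two sources are merged by concatenating their keys and
    values along the token axis. The summary does not confirm this detail.
    """
    return attention(q, k_self + k_ref, v_self + v_ref)


# Toy data: one query token, one self token, one reference token (2-dim features).
q = [[1.0, 0.0]]
k_self, v_self = [[1.0, 0.0]], [[1.0, 0.0]]
k_ref, v_ref = [[0.0, 1.0]], [[0.0, 1.0]]

fused = multi_source_attention(q, k_self, v_self, k_ref, v_ref)
print(fused)
```

With no reference tokens the function reduces to ordinary self-attention, which is one reason this style of fusion suits a plug-in design: the base model's behavior is recovered when the extra source is absent.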
MagicPose employs a multi-stage training strategy to effectively learn these sub-modules and disentangle the appearance and pose control. Extensive experiments demonstrate MagicPose's ability to retain key features of the reference identities, including skin tone and clothing, while following the pose skeleton and facial landmark inputs. Moreover, MagicPose can generalize well to unseen identities and motions without any fine-tuning.
The paper makes the following key contributions:
- An effective method (MagicPose) for human pose and expression retargeting as a plug-in for Stable Diffusion.
- A Multi-Source Attention Module that offers detailed appearance guidance.
- A two-stage training strategy that enables appearance-pose-disentangled generation.
- Demonstration of strong generalizability of the model to diverse image styles and human poses.
- Comprehensive experiments on the TikTok dataset showing superior performance in pose retargeting.
Statistics
MagicPose achieves a Face-Cos score of ~0.426, a substantial +0.260 improvement over the previous state-of-the-art method, DisCo.
MagicPose outperforms previous methods such as FOMM, MRAA, TPS, and DisCo across metrics including FID, SSIM, PSNR, LPIPS, and L1.
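To make the Face-Cos figure concrete, the sketch below does two things: it computes a cosine similarity between two feature vectors, and it works out the baseline score implied by the reported numbers. This summary does not define Face-Cos; reading it as the cosine similarity between face-feature embeddings of the generated and ground-truth images is an assumption. The embedding values are toy numbers made up for illustration, not data from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy embeddings (hypothetical values, NOT from the paper).
generated_face = [0.2, 0.9, 0.1]
ground_truth_face = [0.3, 0.8, 0.0]
print(cosine_similarity(generated_face, ground_truth_face))

# Arithmetic implied by the reported figures: score minus gain gives the baseline.
magicpose_face_cos = 0.426
reported_gain = 0.260
implied_baseline = round(magicpose_face_cos - reported_gain, 3)
print(implied_baseline)
```

Under these figures, the implied DisCo score is about 0.166, so the reported gain corresponds to more than doubling the baseline value.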
Quotes
"MagicPose can provide zero-shot and realistic human poses and facial expressions retargeting for human images of different styles and poses."
"Our novel design enables robust appearance control over generated human images, including body, facial attributes, and even background."
"MagicPose generalizes well to unseen human identities and complex poses without the need for additional fine-tuning."