Core Concepts
Progressive Conditional Diffusion Models (PCDMs) incrementally bridge the gap between source and target poses through three stages, producing high-quality synthesized images.
Abstract
Recent work highlights diffusion models' potential in pose-guided person image synthesis.
PCDMs address the challenge of synthesizing person images under large pose changes by decomposing the task into three stages.
The prior model predicts global features, the inpainting model establishes dense correspondences, and the refining model enhances texture and detail consistency.
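The staged data flow described above can be sketched as follows. This is a minimal illustration only: all function names and signatures are hypothetical placeholders for the prior, inpainting, and refining stages, not the authors' actual code or API.

```python
# Hypothetical sketch of the three-stage PCDM inference pipeline.
# Each stage is passed in as a callable; the signatures below are
# illustrative assumptions, not the paper's implementation.

def synthesize(source_image, source_pose, target_pose,
               prior_model, inpainting_model, refining_model):
    # Stage 1: the prior model predicts global appearance features
    # of the target image from the source image and both poses.
    global_features = prior_model(source_image, source_pose, target_pose)

    # Stage 2: the inpainting model establishes dense correspondences
    # between source and target, producing a coarse target image.
    coarse_image = inpainting_model(source_image, source_pose,
                                    target_pose, global_features)

    # Stage 3: the refining model enhances texture and detail
    # consistency of the coarse result.
    final_image = refining_model(source_image, coarse_image, target_pose)
    return final_image
```

The point of the decomposition is that each stage conditions on the previous stage's output, so errors in global appearance, correspondence, and fine texture are handled separately rather than in a single generation step.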
Quantitative results show PCDMs outperform state-of-the-art methods on SSIM (higher is better) as well as LPIPS and FID (lower is better).
Qualitative comparisons demonstrate PCDMs' ability to generate realistic and detailed person images.
User study results indicate superior performance of PCDMs in perception-oriented tasks.
The ablation study shows that each stage contributes progressively to the final image quality.
Application in person re-identification tasks shows significant performance improvement over baselines and SOTA methods.
Stats
PCDMs lead in two of the three metrics on the DeepFashion dataset compared with other models.
On the Market-1501 dataset, PCDMs outperform all SOTA methods on SSIM, LPIPS, and FID.
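For reference, SSIM measures structural similarity between two signals and reaches its maximum of 1.0 for identical inputs. The sketch below is a pure-Python rendition of the global SSIM formula applied to flat sequences of pixel values; real evaluations use windowed SSIM over full images (e.g. `skimage.metrics.structural_similarity`), so treat this only as an illustration of what the metric computes.

```python
# Simplified, single-window SSIM over two equal-length sequences of
# pixel values. C1 and C2 are the standard stabilizing constants
# (K1=0.01, K2=0.03); `data_range` is the dynamic range of the values.

def ssim_global(x, y, data_range=1.0):
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # Population variance and covariance over the whole sequence.
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical inputs score exactly 1.0, and any structural mismatch lowers the score; LPIPS and FID, by contrast, require learned networks and are not reproducible in a few lines.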