Core Concepts
A novel two-stage approach to human image generation that improves hand quality and pose control in diffusion models.
Abstract
Recent advances in diffusion models have driven significant progress in human image generation. However, producing anatomically correct hands and precisely controlling hand poses remain open challenges. This article introduces a two-stage approach that splits the process into a hand-generation stage and a body-outpainting stage. The hand generator is trained in a multi-task setting to produce segmentation masks alongside hand images; an adapted ControlNet model then outpaints the body around the generated hands. A blending technique fuses the results of both stages so the final image is synthesized seamlessly, and experiments show the method outperforms existing techniques.
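The final fusion step combines the stage-one hand image with the stage-two outpainted body using the predicted hand segmentation mask. A minimal sketch of such mask-based alpha blending, with hypothetical function and argument names (the paper's exact feathering procedure is not specified here, so a simple box blur stands in for it):

```python
import numpy as np

def blend_stages(hand_image, body_image, hand_mask, feather=2):
    """Fuse the stage-one hand image with the stage-two outpainted body.

    hand_image, body_image: float arrays of shape (H, W, 3) in [0, 1].
    hand_mask: float array of shape (H, W) in [0, 1], 1.0 where the
    generated hand should dominate the output.
    """
    # Soften the mask edge so the seam between the two stages is not
    # visible; each pass averages every pixel with its 4 neighbours.
    soft = hand_mask.astype(float).copy()
    for _ in range(feather):
        soft = (soft
                + np.roll(soft, 1, axis=0) + np.roll(soft, -1, axis=0)
                + np.roll(soft, 1, axis=1) + np.roll(soft, -1, axis=1)) / 5.0
    alpha = soft[..., None]  # broadcast the mask over the RGB channels
    return alpha * hand_image + (1.0 - alpha) * body_image
```

With `feather=0` the blend is a hard cut along the mask boundary; increasing `feather` widens the transition band between the two stages.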
Stats
Pose accuracy improved by 30.5%
Hand DAP increased by 92.3%
MPJPE (mean per-joint position error) reduced by 50% for the full body and 40% for the hands
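The MPJPE figures above use a standard pose-estimation metric: the Euclidean distance between each predicted joint and its ground-truth position, averaged over all joints. A minimal sketch (array shapes are illustrative):

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error.

    pred, gt: arrays of shape (num_joints, dims) holding predicted and
    ground-truth joint coordinates. Returns the mean Euclidean distance.
    """
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))
```

A 50% reduction in MPJPE therefore means the generated joints land, on average, half as far from the target pose as before.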
Quotes
"Our approach not only enhances the quality of the generated hands but also offers improved control over hand pose."
"Experimental evaluations demonstrate the superiority of our proposed method over state-of-the-art techniques."