Core Concepts
The proposed ShoeModel system can generate hyper-realistic advertising images of user-specified shoes worn by human models, while preserving the identity of the shoes and producing plausible interactions between the shoes and the human legs.
Abstract
The paper introduces the ShoeModel system, which aims to generate hyper-realistic advertising images of user-specified shoes worn by human models. The system consists of three key modules:
Wearable-area Detection (WD) Module: This module detects the visible and wearable areas of the input shoe image, allowing the system to avoid occlusion issues when generating the final image.
Leg-pose Synthesis (LpS) Module: This module generates diverse and plausible leg poses that align with the given shoe image, providing reasonable pose constraints for the subsequent human body generation.
Shoe-wearing (SW) Module: This module combines the processed shoe image and the synthesized leg pose to generate the final hyper-realistic advertising image, while ensuring the identity of the input shoes is maintained.
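The data flow between the three modules can be sketched as follows. This is a toy illustration only: the function names, the list-based image types, the `pad` parameter, and the rule-based stand-ins are all assumptions made for this example. They are not the paper's learned models or its actual interfaces.

```python
from typing import List, Tuple

Image = List[List[int]]       # toy grayscale image, 0 = background
Mask = List[List[int]]        # 1 marks a pixel kept as wearable/visible shoe area
Pose = List[Tuple[int, int]]  # (row, col) leg keypoints in output-canvas coordinates

LEG = -1  # hypothetical marker value for generated leg pixels


def detect_wearable_area(shoe: Image) -> Mask:
    """Stand-in for the WD module: treat every non-zero pixel as wearable."""
    return [[1 if px > 0 else 0 for px in row] for row in shoe]


def synthesize_leg_pose(mask: Mask, pad: int) -> Pose:
    """Stand-in for the LpS module: place a knee and an ankle keypoint
    in a column above the shoe, on a canvas with `pad` extra rows on top."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        raise ValueError("mask contains no wearable area")
    center = sum(cols) // len(cols)      # mean column of the wearable pixels
    ankle = (min(rows) + pad, center)    # top of the shoe, shifted by the padding
    knee = (0, center)                   # top edge of the canvas
    return [knee, ankle]


def wear_shoe(shoe: Image, mask: Mask, pose: Pose, pad: int) -> Image:
    """Stand-in for the SW module: draw a leg along the pose, then paste the
    masked shoe pixels unchanged so the shoe's identity is preserved."""
    width = len(shoe[0])
    canvas = [[0] * width for _ in range(pad + len(shoe))]
    (knee_row, col), (ankle_row, _) = pose
    for r in range(knee_row, ankle_row):
        canvas[r][col] = LEG
    for r, row in enumerate(shoe):
        for c, px in enumerate(row):
            if mask[r][c]:
                canvas[r + pad][c] = px
    return canvas


def run_pipeline(shoe: Image, pad: int = 2) -> Image:
    """Chain the three stages in the order the summary describes: WD, LpS, SW."""
    mask = detect_wearable_area(shoe)
    pose = synthesize_leg_pose(mask, pad)
    return wear_shoe(shoe, mask, pose, pad)


if __name__ == "__main__":
    toy_shoe = [[0, 5, 5, 0],
                [7, 7, 7, 7]]
    for row in run_pipeline(toy_shoe):
        print(row)
```

The point of the sketch is the ordering and the hand-offs: the mask from the first stage constrains the pose in the second, and the final stage consumes both while copying the shoe pixels through untouched. In the actual system each stage would be a trained model rather than a hand-written rule.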
The authors also introduce a custom shoe-wearing dataset to support training of the proposed system. They report that extensive experiments demonstrate the effectiveness of ShoeModel: it generates high-quality, realistic images that preserve the identity of the user-specified shoes and exhibit reasonable interactions between the shoes and the human models, and it outperforms various baseline methods.
Stats
The paper does not provide any specific numerical data or statistics. The focus is on the system design and the qualitative evaluation of the generated images.
Quotes
The paper does not contain any direct quotes that are particularly striking or that support its key arguments.