The ASUKA framework targets two problems in image inpainting: context instability and visual inconsistency. It uses a Masked Auto-Encoder (MAE) as a prior and aligns the MAE with the Stable Diffusion (SD) model to improve context stability. An inpainting-specialized decoder then improves visual consistency by mitigating color inconsistencies between masked and unmasked regions. ASUKA is validated on the benchmark datasets Places2 and MISATO, where it is reported to give better results than state-of-the-art methods.
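To make the color-inconsistency problem concrete, the sketch below shows how a small color shift in a generated image becomes a visible seam once the result is composited with the original pixels. This is not ASUKA's decoder or any part of its pipeline. The function names `composite` and `unmasked_color_shift`, the 0.05 brightness offset, and the 8x8 test image are all hypothetical choices made only for illustration, assuming images are float arrays in [0, 1] and the mask is 1 inside the hole.

```python
import numpy as np

def composite(original, generated, mask):
    """Keep original pixels where mask == 0, use generated pixels where mask == 1."""
    m = mask[..., None].astype(original.dtype)  # broadcast the mask over color channels
    return m * generated + (1.0 - m) * original

def unmasked_color_shift(original, generated, mask):
    """Mean per-channel difference over the known (mask == 0) pixels.

    A non-zero value means the generated image has drifted in color relative
    to the original, so the pasted region will not match its surroundings.
    """
    known = mask == 0
    return (generated[known] - original[known]).mean(axis=0)

# Hypothetical toy data: a random 8x8 RGB image with a 4x4 hole in the middle.
rng = np.random.default_rng(0)
original = rng.random((8, 8, 3))
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:6, 2:6] = 1

# Simulate a generated image that is uniformly brighter by 0.05 (an assumption).
generated = original + 0.05

shift = unmasked_color_shift(original, generated, mask)
blended = composite(original, generated, mask)
```

In this toy setup the shift is the same in every channel, and the blended image keeps the original pixels outside the hole while the hole carries the brighter generated pixels, which is the kind of mismatch between masked and unmasked regions that the summary describes.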
Key insights drawn from
by Yikai Wang,C... at arxiv.org, 03-19-2024
https://arxiv.org/pdf/2312.04831.pdf