ASUKA is a framework that offers a balanced solution to two problems in image inpainting: context instability and visual inconsistency. It uses a Masked Auto-Encoder (MAE) as a prior and aligns the MAE with the Stable Diffusion (SD) model to improve context stability. In addition, an inpainting-specialized decoder improves visual consistency by mitigating color inconsistencies between masked and unmasked regions. ASUKA is validated on the benchmark datasets Places2 and MISATO, where it is reported to outperform state-of-the-art methods.
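To make the color-inconsistency problem concrete, the toy sketch below composites a "generated" image into a masked hole and measures the mean color gap between the masked and unmasked regions. It is only an illustration under stated assumptions, not ASUKA's decoder: the helper names `composite` and `region_color_gap`, the flat gray test image, and the uniform 0.1 color shift are all hypothetical.

```python
import numpy as np

def composite(original, generated, mask):
    """Paste generated pixels into the hole (mask == 1); keep original pixels elsewhere."""
    m = mask[..., None].astype(original.dtype)
    return m * generated + (1.0 - m) * original

def region_color_gap(image, mask):
    """Absolute difference between the per-channel mean colors of the
    masked region and the unmasked region."""
    inside = image[mask == 1].mean(axis=0)
    outside = image[mask == 0].mean(axis=0)
    return np.abs(inside - outside)

# Toy data (assumption): a flat gray image and a "decoded" copy whose colors
# are uniformly shifted, standing in for a decoder that drifts in color.
original = np.full((8, 8, 3), 0.5)
generated = original + 0.1
mask = np.zeros((8, 8), dtype=int)
mask[2:6, 2:6] = 1  # square hole to inpaint

blended = composite(original, generated, mask)
gap = region_color_gap(blended, mask)
```

On this toy input the gap equals the injected shift in every channel, which is the kind of visible seam a consistency-aware decoder aims to reduce.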
Key insights extracted from the paper by Yikai Wang, C..., arxiv.org, 03-19-2024
https://arxiv.org/pdf/2312.04831.pdf