The ASUKA framework proposes a balanced solution to address context-instability and visual inconsistency in image inpainting. By utilizing a Masked Auto-Encoder (MAE) as a prior, ASUKA aligns the MAE with the Stable Diffusion (SD) model to improve context stability. Additionally, an inpainting-specialized decoder is used to enhance visual consistency by mitigating color inconsistencies between masked and unmasked regions. The effectiveness of ASUKA is validated on benchmark datasets Places 2 and MISATO, showcasing superior results compared to state-of-the-art methods.
เป็นภาษาอื่น
จากเนื้อหาต้นฉบับ
arxiv.org
ข้อมูลเชิงลึกที่สำคัญจาก
by Yikai Wang,C... ที่ arxiv.org 03-19-2024
https://arxiv.org/pdf/2312.04831.pdfสอบถามเพิ่มเติม