ASUKA is a framework that offers a balanced solution to two problems in image inpainting: context instability and visual inconsistency. It uses a Masked Auto-Encoder (MAE) as a prior and aligns the MAE with the Stable Diffusion (SD) model to improve context stability. An inpainting-specialized decoder then improves visual consistency by mitigating color inconsistencies between masked and unmasked regions. ASUKA is validated on the benchmark datasets Places2 and MISATO, where it is reported to outperform state-of-the-art methods.
Key insights distilled from
by Yikai Wang,C... at arxiv.org 03-19-2024
https://arxiv.org/pdf/2312.04831.pdf