This paper introduces HIVE, a framework that leverages human feedback to enhance instructional visual editing. The framework collects human feedback on edited images to capture user preferences and uses scalable diffusion model fine-tuning methods to incorporate this feedback. Extensive experiments show that HIVE outperforms previous state-of-the-art models by a large margin. The paper also discusses the challenges, methodology, experiments, ablation studies, and limitations of the approach.
Key Insights Distilled From
by Shu Zhang, Xi... at arxiv.org 03-28-2024
https://arxiv.org/pdf/2303.09618.pdf