Core Concepts
The authors introduce StereoDiffusion, a training-free method for generating stereo image pairs with latent diffusion models. By modifying the latent variable of the diffusion model, it generates high-quality stereo images quickly, with no model fine-tuning.
Summary
StereoDiffusion generates stereo image pairs without any training and integrates directly into Stable Diffusion models. It modifies the latent variable using Stereo Pixel Shift operations, together with Symmetric Pixel Shift Masking Denoise and Self-Attention Layers Modification, to keep the left and right images consistent. The method achieves state-of-the-art scores in quantitative evaluations on datasets including Middlebury and KITTI, offering a lightweight solution for fast, high-quality stereo image generation.
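To make the pixel-shift idea concrete, here is a minimal sketch of a disparity-driven horizontal shift on a latent-shaped array. It is an illustration under stated assumptions, not the authors' implementation: the function name stereo_pixel_shift, the scale parameter, the (C, H, W) layout, and the rule that nearer pixels win at collisions are all hypothetical choices made for this example. It also returns a mask of filled positions, since the unfilled (disoccluded) holes are the kind of region a masking step would have to deal with.

```python
import numpy as np

def stereo_pixel_shift(latent, disparity, scale=1.0):
    """Shift each pixel of a (C, H, W) array left by its disparity.

    Hypothetical helper for illustration only. Returns the shifted array
    (a rough right-view latent) and a boolean (H, W) mask that is True
    wherever a source pixel landed.
    """
    c, h, w = latent.shape
    right = np.zeros_like(latent)
    filled = np.zeros((h, w), dtype=bool)
    shift = np.rint(disparity * scale).astype(int)
    for y in range(h):
        # Visit pixels from small to large disparity so that nearer pixels
        # (larger disparity) overwrite farther ones when targets collide.
        order = np.argsort(shift[y], kind="stable")
        for x in order:
            tx = x - shift[y, x]
            if 0 <= tx < w:
                right[:, y, tx] = latent[:, y, x]
                filled[y, tx] = True
    return right, filled

# Usage: a 1-channel 2x4 "latent" where the right half of row 0 is nearer.
latent = np.arange(8, dtype=float).reshape(1, 2, 4)
disparity = np.array([[0, 0, 1, 1], [0, 0, 0, 0]], dtype=float)
right, filled = stereo_pixel_shift(latent, disparity)
```

In this toy case the nearer pixels in row 0 slide one column left and cover a farther pixel, leaving one unfilled position at the right edge of that row. How such holes are filled, and how consistency between the two views is enforced, is exactly what the masking and self-attention components described above are for; this sketch does not attempt either.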
Statistics
Our method achieved better scores on both the KITTI and Middlebury datasets.
The reference scores for the KITTI dataset are lower than those for the Middlebury dataset.
In user tests, our method had the highest average but did not significantly outperform the others.
Deblur has some negative impact on LPIPS and SSIM scores on the Middlebury dataset.