This paper proposes a comprehensive benchmark for evaluating video frame interpolation methods. The benchmark comprises a carefully designed synthetic test dataset that adheres to the linear-motion assumption, a consistent set of error metrics, and an in-depth analysis of interpolation quality with respect to per-pixel attributes such as motion magnitude and occlusion.
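As a minimal sketch of what such a per-attribute analysis could look like (not the benchmark's released code), the snippet below bins per-pixel interpolation error by ground-truth motion magnitude; the bin edges, the L1 error choice, and the array layout are assumptions for illustration.

```python
import numpy as np

def error_by_motion_magnitude(pred, gt, flow, bins=(0, 2, 8, 32, np.inf)):
    """Report mean per-pixel L1 error inside motion-magnitude bins.

    pred, gt: HxWx3 float arrays (interpolated and ground-truth frames).
    flow:     HxWx2 ground-truth flow field for the target frame.
    """
    err = np.abs(pred - gt).mean(axis=-1)        # per-pixel L1 error
    mag = np.linalg.norm(flow, axis=-1)          # per-pixel motion magnitude
    stats = {}
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (mag >= lo) & (mag < hi)
        stats[f"[{lo},{hi})"] = float(err[mask].mean()) if mask.any() else None
    return stats
```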
VIDIM, a generative model for video interpolation, creates short videos given a start and end frame: cascaded diffusion models first generate the target video at low resolution and then super-resolve it, enabling high-fidelity results even for complex, nonlinear, or ambiguous motions.
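The following is a hedged sketch of that two-stage cascade, not VIDIM's actual API: `base_sample` and `sr_sample` stand in for two pretrained diffusion samplers, and their names, signatures, and the 4x downscaling factor are assumptions.

```python
import torch
import torch.nn.functional as F

def interpolate_video(start, end, base_sample, sr_sample, n_frames=7):
    """start, end: (C, H, W) endpoint frames; returns a (T, C, H, W) clip."""
    # Stage 1: generate the full clip at low resolution, conditioned on
    # downsampled versions of the start and end frames.
    lo_start = F.interpolate(start[None], scale_factor=0.25)
    lo_end = F.interpolate(end[None], scale_factor=0.25)
    lo_video = base_sample(lo_start, lo_end, n_frames)   # (T, C, h, w)

    # Stage 2: a second diffusion model super-resolves the generated clip,
    # conditioned on the low-res result and the full-res endpoint frames.
    hi_video = sr_sample(lo_video, start, end)           # (T, C, H, W)
    return hi_video
```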
PerVFI introduces a novel perception-oriented video frame interpolation paradigm that tackles blur and ghosting artifacts by combining an asymmetric synergistic blending module with a conditional normalizing-flow-based generator.
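A minimal sketch of the asymmetric blending idea is given below: features from the two reference frames are mixed with a learned, spatially varying weight, so one reference can dominate in regions the other occludes. The gating network and module names are assumptions, not PerVFI's exact architecture.

```python
import torch
import torch.nn as nn

class AsymmetricBlend(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Predict a per-pixel blending weight from both feature maps.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, feat0, feat1):
        w = self.gate(torch.cat([feat0, feat1], dim=1))  # weight in [0, 1]
        return w * feat0 + (1 - w) * feat1               # asymmetric mixture
```

The conditional normalizing-flow generator would then map these blended features, together with a sampled latent, to a sharp output frame; that half of the pipeline is omitted here.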
This work presents an efficient video frame interpolation framework that achieves state-of-the-art performance with clear improvements while requiring far fewer computational resources.
The proposed Motion-Aware Latent Diffusion Model (MADIFF) incorporates inter-frame motion priors between the target interpolated frame and its conditioning neighboring frames, generating visually smooth and realistic interpolated frames that significantly outperform those of existing approaches.
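In the spirit of that summary, the sketch below shows one plausible way to condition a latent diffusion denoiser on a motion prior by channel-stacking it with the latents; `denoiser`, the flow-prior input, and the channel layout are assumptions, not MADIFF's published interface.

```python
import torch

def denoise_step(z_t, t, z_prev, z_next, flow_prior, denoiser):
    """One denoising step for the target frame's noisy latent z_t.

    z_prev, z_next: latents of the conditioning neighboring frames.
    flow_prior:     estimated inter-frame motion (e.g., optical flow),
                    resized to the latent resolution.
    """
    # Stack the noisy target latent with the neighbor latents and the motion
    # prior so the model can reason about where content moves between frames.
    cond = torch.cat([z_t, z_prev, z_next, flow_prior], dim=1)
    eps_hat = denoiser(cond, t)   # predicted noise for the target latent
    return eps_hat
```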