This paper proposes a comprehensive benchmark for evaluating video frame interpolation methods. The benchmark comprises a carefully designed synthetic test dataset constrained to linear motion, consistent error metrics, and an in-depth analysis of interpolation quality with respect to per-pixel attributes such as motion magnitude and occlusion.
VIDIM, a generative model for video interpolation, synthesizes short videos from a start and an end frame using cascaded diffusion models: it first generates the target video at low resolution and then refines it at high resolution, yielding high-fidelity results even for complex, nonlinear, or ambiguous motion.
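The cascade structure described above can be illustrated with a toy sketch. This is not VIDIM's actual architecture (its stages are diffusion models conditioned on the endpoint frames); the stand-in functions `base_model` and `upsample` are hypothetical placeholders that only show the two-stage low-resolution-then-high-resolution pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def base_model(start, end, n_frames):
    """Toy stand-in for a low-resolution base generator:
    here it just linearly blends the two endpoint frames."""
    alphas = np.linspace(0.0, 1.0, n_frames)
    return np.stack([(1 - a) * start + a * end for a in alphas])

def upsample(frame, factor):
    """Toy stand-in for a super-resolution stage:
    nearest-neighbour upsampling of one frame."""
    return frame.repeat(factor, axis=0).repeat(factor, axis=1)

# 8x8 "low-resolution" endpoint frames.
start_lo = rng.random((8, 8))
end_lo = rng.random((8, 8))

# Stage 1: generate the whole target video at low resolution.
video_lo = base_model(start_lo, end_lo, n_frames=5)

# Stage 2: refine each low-resolution frame to high resolution,
# conditioned on the stage-1 output.
video_hi = np.stack([upsample(f, factor=4) for f in video_lo])

print(video_lo.shape, video_hi.shape)  # (5, 8, 8) (5, 32, 32)
```

In the real cascade each stage is a learned generative model, so the second stage adds detail rather than merely resizing; the sketch only captures the data flow between stages.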
This paper introduces PerVFI, a perception-oriented video frame interpolation paradigm that tackles blur and ghosting artifacts with an asymmetric synergistic blending module and a conditional normalizing-flow-based generator.
This work presents an efficient video frame interpolation framework that achieves state-of-the-art performance with clear improvements while requiring far fewer computational resources.
The proposed Motion-Aware Latent Diffusion Model (MADIFF) incorporates inter-frame motion priors between the target interpolated frame and its conditioning neighboring frames to generate visually smooth and realistic interpolated video frames, significantly outperforming existing approaches.
VFIMamba leverages the strengths of Selective State Space Models (S6), particularly their efficiency and global receptive field, to achieve state-of-the-art performance in video frame interpolation, especially for high-resolution videos.
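The S6 recurrence behind models like VFIMamba can be sketched in a few lines. This is a simplified scalar version, not VFIMamba's implementation: in a real S6 layer the coefficients `a`, `b`, `c` are computed from the input by learned projections (that input dependence is what makes the scan "selective"), and the scan runs over multi-channel hidden states with a hardware-aware parallel algorithm. The sketch only shows why one linear pass gives every output a global receptive field over the preceding sequence:

```python
import numpy as np

def selective_scan(x, a, b, c):
    """Toy 1-D selective-scan recurrence at the core of S6-style models:
    h_t = a_t * h_{t-1} + b_t * x_t,  y_t = c_t * h_t.
    A single O(T) pass, yet y_t depends on all of x_0..x_t."""
    h = 0.0
    y = np.empty_like(x)
    for t in range(len(x)):
        h = a[t] * h + b[t] * x[t]  # carry compressed history forward
        y[t] = c[t] * h             # read out the current state
    return y

x = np.array([1.0, 2.0, 3.0, 4.0])
# Fixed coefficients here, purely for illustration; S6 derives them from x.
a = np.full(4, 0.5)
b = np.ones(4)
c = np.ones(4)
y = selective_scan(x, a, b, c)  # y == [1.0, 2.5, 4.25, 6.125]
```

The linear cost in sequence length (versus the quadratic cost of attention) is what makes this family attractive for high-resolution frame interpolation, where the token count per frame pair is large.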