Efficient Text-to-Video Generation with Grid Diffusion Models
Our novel grid diffusion models efficiently generate high-quality videos from text by reducing the temporal dimension of videos to the image dimension, enabling the use of various image-based methods for video tasks.