UniCtrl is a training-free method for improving the spatiotemporal consistency and motion diversity of videos generated by text-to-video models. It enforces semantic consistency across frames through cross-frame self-attention control, which improves overall video quality.
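
To make the idea of cross-frame self-attention control more concrete, the following is a minimal NumPy sketch of one common form of it: every frame keeps its own queries but attends to the keys and values of a single reference frame. The function name `cross_frame_self_attention`, the choice of the first frame as reference, and the `(frames, tokens, dim)` tensor layout are assumptions made for illustration. They are not taken from the text above and may differ from what UniCtrl actually does.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_frame_self_attention(q, k, v, ref_frame=0):
    """Hypothetical sketch of cross-frame attention control.

    q, k, v: arrays of shape (frames, tokens, dim).
    Each frame uses its own queries, but keys and values are
    taken from `ref_frame` (an assumed choice) for all frames.
    """
    d = q.shape[-1]
    # Share the reference frame's keys and values across all frames.
    k_ref = np.broadcast_to(k[ref_frame], k.shape)
    v_ref = np.broadcast_to(v[ref_frame], v.shape)
    # Scaled dot-product attention, batched over frames.
    scores = q @ k_ref.transpose(0, 2, 1) / np.sqrt(d)
    return softmax(scores, axis=-1) @ v_ref

# Usage: 4 frames, 6 tokens per frame, 8-dimensional features.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 6, 8))
k = rng.normal(size=(4, 6, 8))
v = rng.normal(size=(4, 6, 8))
out = cross_frame_self_attention(q, k, v)
```

Because all frames read from the same keys and values in this sketch, their outputs are mixtures of the same reference features, which is one plausible way to keep semantic content aligned across frames without retraining the model.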