Core Concepts
StreamingT2V enables seamless long video generation from text with high motion dynamics and consistency.
Abstract
StreamingT2V introduces an autoregressive technique for generating long videos with smooth transitions and high motion dynamics.
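The method uses a short-term memory block, the conditional attention module (CAM), to keep transitions between chunks temporally consistent, and a long-term memory block, the appearance preservation module (APM), to preserve scene features across the video.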
A randomized blending approach lets video enhancement be applied across chunks without introducing inconsistencies between them.
Experiments show that StreamingT2V outperforms competing methods in both the amount of motion and temporal consistency.
The method allows for the creation of extended videos from text instructions without stagnation or inconsistencies.
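The randomized blending idea mentioned above can be illustrated with a small sketch. This is one interpretation, not the authors' implementation: it assumes consecutive chunks share a fixed number of overlapping frames and that, inside each shared region, a random cut point decides which chunk supplies each frame. The names `blend_pair`, `blend_chunks`, `overlap`, and `seed` are hypothetical, and frames are represented as plain Python values rather than latents or images.

```python
import random
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def blend_pair(prev_chunk: Sequence[Frame], next_chunk: Sequence[Frame],
               overlap: int, rng: random.Random) -> List[Frame]:
    """Join two chunks whose last/first `overlap` frames cover the same time span.

    Hypothetical helper: a random cut inside the shared region decides where
    the output switches from `prev_chunk` frames to `next_chunk` frames.
    """
    if not 0 < overlap <= min(len(prev_chunk), len(next_chunk)):
        raise ValueError("overlap must be positive and fit inside both chunks")

    start = len(prev_chunk) - overlap   # index where the shared region begins
    cut = rng.randint(0, overlap)       # inclusive on both ends

    head = list(prev_chunk[:start])                    # frames only in prev_chunk
    shared_prev = list(prev_chunk[start:start + cut])  # shared frames taken from prev
    shared_next = list(next_chunk[cut:overlap])        # shared frames taken from next
    tail = list(next_chunk[overlap:])                  # frames only in next_chunk
    return head + shared_prev + shared_next + tail


def blend_chunks(chunks: Sequence[Sequence[Frame]], overlap: int,
                 seed: int = 0) -> List[Frame]:
    """Fold a list of overlapping chunks into one video, chunk by chunk."""
    rng = random.Random(seed)
    video = list(chunks[0])
    for chunk in chunks[1:]:
        video = blend_pair(video, chunk, overlap, rng)
    return video


# Toy data: each frame is (chunk_id, global_time_index); chunks overlap by 2 frames.
chunks = [[(c, c * 6 + i) for i in range(8)] for c in range(3)]
video = blend_chunks(chunks, overlap=2, seed=0)
print(len(video))
print([t for _, t in video])
```

Because the cut position is drawn at random for every join, the switch between chunks does not always fall on the same frame, which is the intuition behind avoiding a fixed, visible seam. How the actual method draws and applies the cut (for example, on latents during enhancement) is not specified in this summary.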
Stats
StreamingT2V generates long videos from text, achieving smooth transitions and high motion dynamics.