Consistent Self-Attention for Generating Subject-Coherent Images and Videos from Text Prompts
StoryDiffusion generates consistent images and videos from text prompts by incorporating Consistent Self-Attention and Semantic Motion Prediction, enabling the creation of coherent visual narratives.