Generating High-Quality and Diverse Two-Person Interaction Motions with Text Guidance
Our approach, InterGen, generates high-quality and diverse two-person interaction motions from text prompts. It does so by introducing a novel multimodal dataset, cooperative denoising networks, and effective spatial relation modeling.