The study introduces LM2D, a novel probabilistic architecture for dance synthesis conditioned on both music and lyrics. It addresses the limitations of existing models by combining a multimodal diffusion model with consistency distillation. The research also contributes the first 3D dance-motion dataset that includes both music and lyrics. Objective metrics and human evaluations indicate that LM2D produces realistic dances that match both the lyrics and the music. The study also examines the impact of lyrics on choreography and emphasizes the need for efficient single-step generation methods.
Key insights extracted from
by Wenj... at arxiv.org, 03-15-2024
https://arxiv.org/pdf/2403.09407.pdf