Conversational Motion Controllers: Generating Continuous Human Motions through Multimodal Prompts
MotionChain is a unified vision-motion-language generative model that generates continuous human motions from multimodal prompts (text, image, and motion) in a step-by-step conversational manner.