OMG introduces a novel framework for generating compelling human motion from open-vocabulary text prompts in a zero-shot setting.
The proposed Bidirectional Autoregressive Diffusion (BAD) framework unifies the strengths of autoregressive and mask-based generative models to effectively capture both sequential and bidirectional relationships in text-guided human motion generation.
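As a rough illustration of how autoregressive and mask-based prediction can be unified, the sketch below builds a hybrid attention mask in which tokens are revealed in a random permutation order (autoregressive) while each prediction attends bidirectionally to everything already revealed. This is a minimal sketch under assumed shapes; `build_hybrid_mask` and its details are illustrative, not BAD's actual implementation.

```python
import torch

def build_hybrid_mask(T: int) -> torch.Tensor:
    """Return a (T, T) boolean mask where mask[i, j] = True means token i
    may attend to token j. Tokens are revealed in a random permutation;
    each token attends to all tokens revealed at or before its own step."""
    order = torch.randperm(T)           # random generation order (assumption)
    rank = torch.empty(T, dtype=torch.long)
    rank[order] = torch.arange(T)       # rank[t] = step at which token t is revealed
    # token i attends to token j iff j is revealed no later than i
    return rank.unsqueeze(1) >= rank.unsqueeze(0)

mask = build_hybrid_mask(8)
print(mask.int())
```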
This two-stage method learns expressive text-to-motion generation from partially annotated data, using VQ-VAE experts for high-quality motion representation and a multi-indexing GPT model to coordinate body, hand, and facial motions.
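For context, the sketch below shows the standard vector-quantization step that VQ-VAE-style motion tokenizers rely on: continuous encoder features are snapped to their nearest codebook entry, with a straight-through estimator so gradients still reach the encoder. Codebook size, feature width, and shapes are assumptions, not the paper's configuration.

```python
import torch

codebook = torch.randn(512, 64)        # 512 codes, 64-dim motion features (assumed)

def quantize(z: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """z: (batch, frames, 64) encoder output -> (quantized features, token ids)."""
    dist = torch.cdist(z.reshape(-1, 64), codebook)   # L2 distance to every code
    idx = dist.argmin(dim=-1)                         # nearest-code token ids
    z_q = codebook[idx].view_as(z)
    z_q = z + (z_q - z).detach()                      # straight-through gradient
    return z_q, idx.view(z.shape[:-1])

z = torch.randn(2, 30, 64, requires_grad=True)
z_q, tokens = quantize(z)
print(tokens.shape)   # (2, 30): one discrete motion token per frame
```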
This spatial-temporal modeling framework generates human motions from textual prompts by quantizing each joint into a 2D token map and leveraging 2D operations to capture spatial-temporal relationships.
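A minimal sketch of the core idea, assuming a (frames × joints) token grid: each quantized joint token is embedded and a plain 2D convolution then mixes information across both time and the skeleton at once. Vocabulary size, embedding width, and joint count are illustrative assumptions.

```python
import torch
import torch.nn as nn

vocab, dim, T, J = 256, 32, 60, 22            # codes, embed width, frames, joints (assumed)
tokens = torch.randint(0, vocab, (1, T, J))   # one token per (frame, joint) cell

embed = nn.Embedding(vocab, dim)
conv = nn.Conv2d(dim, dim, kernel_size=3, padding=1)  # mixes time AND joint axes

x = embed(tokens).permute(0, 3, 1, 2)   # (B, dim, T, J): channels-first 2D map
x = conv(x)                             # each cell sees temporal and spatial neighbors
print(x.shape)                          # torch.Size([1, 32, 60, 22])
```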
MotionRL is a novel approach that leverages reinforcement learning to fine-tune text-to-motion generation models, prioritizing human perception of motion over conventional numerical metrics and balancing text fidelity, motion quality, and human preference to produce optimal motions.
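The sketch below shows the general shape of such preference-based fine-tuning with a REINFORCE-style policy gradient: a generator samples motion tokens, a combined reward scores them, and high-reward samples are reinforced. Every module here is a placeholder stand-in, not MotionRL's actual architecture or reward.

```python
import torch
import torch.nn as nn

policy = nn.Linear(16, 256)               # stand-in for a motion-token generator

def reward(sample: torch.Tensor) -> torch.Tensor:
    # stand-in for a combined text-fidelity / quality / human-preference reward
    return torch.randn(sample.shape[0])

opt = torch.optim.Adam(policy.parameters(), lr=1e-5)
state = torch.randn(8, 16)                 # stand-in for text-conditioned context

logits = policy(state)
dist = torch.distributions.Categorical(logits=logits)
actions = dist.sample()                    # sampled motion tokens
r = reward(actions.unsqueeze(-1).float())
baseline = r.mean()                        # simple baseline for variance reduction
loss = -((r - baseline) * dist.log_prob(actions)).mean()
loss.backward()
opt.step()
```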
ReinDiffuse combines reinforcement learning with motion diffusion models to enhance the physical plausibility of human motion generated from text descriptions, eliminating the need for computationally expensive physics simulations.
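To make the "no physics simulator" point concrete, here is a minimal sketch of an analytic plausibility reward computed directly from generated joint trajectories, penalizing foot skating and ground penetration. Joint indices, thresholds, and weights are illustrative assumptions, not ReinDiffuse's reward design.

```python
import torch

def physics_reward(joints: torch.Tensor, foot_idx=(7, 8), floor_y=0.0):
    """joints: (frames, num_joints, 3) generated positions -> scalar reward."""
    feet = joints[:, foot_idx, :]                       # (frames, 2, 3)
    on_ground = feet[..., 1] < floor_y + 0.05           # heuristic contact test
    slide = (feet[1:, :, [0, 2]] - feet[:-1, :, [0, 2]]).norm(dim=-1)
    skating = (slide * on_ground[1:].float()).mean()    # feet moving while planted
    penetration = (floor_y - joints[..., 1]).clamp(min=0).mean()
    return -(skating + 10.0 * penetration)              # higher = more plausible

print(physics_reward(torch.randn(60, 22, 3)))
```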
This research introduces a novel method that enhances the variation and realism of motions generated by text-to-motion models by incorporating pose- and video-editing techniques, mitigating the data scarcity of current text-motion datasets.
LEAD, a novel text-to-motion generation model, leverages latent diffusion and a realignment mechanism to create semantically structured motion latents, improving realism and expressiveness and enabling textual motion inversion for personalized motion synthesis.
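As a rough sketch of what textual inversion looks like when adapted to motion: a new pseudo-token embedding is optimized so a frozen generator reconstructs a reference motion, after which the token can be reused in new prompts. The frozen model and loss below are stand-ins, not LEAD's internals.

```python
import torch

frozen_model = torch.nn.Linear(32, 63).requires_grad_(False)  # stand-in generator
reference_motion = torch.randn(63)                            # motion to invert

pseudo_token = torch.randn(32, requires_grad=True)            # hypothetical "<my-walk>" embedding
opt = torch.optim.Adam([pseudo_token], lr=1e-2)

for _ in range(200):
    opt.zero_grad()
    recon = frozen_model(pseudo_token)        # generate from the learned token
    loss = (recon - reference_motion).pow(2).mean()
    loss.backward()                           # only the embedding is updated
    opt.step()
```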