Social Masked Autoencoder for Multi-person Motion Representation Learning
The core contribution of this paper is Social-MAE, a simple yet effective transformer-based masked autoencoder framework that learns generalizable and data-efficient representations of multi-person human motion through unsupervised pre-training.
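The masked-autoencoder pre-training idea can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the shapes, the 75% mask ratio, and the single linear "encoder"/"decoder" maps are all illustrative stand-ins (a real model would use transformer blocks over the visible tokens).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: P persons, T frames, J joints with D coordinates each.
P, T, J, D = 2, 16, 15, 2
motion = rng.normal(size=(P, T, J * D))  # flattened per-frame pose features

# Tokenize: one token per (person, frame) -> sequence of P*T tokens.
tokens = motion.reshape(P * T, J * D)

# Random masking (75% here, a common MAE-style ratio; illustrative only).
mask_ratio = 0.75
n = tokens.shape[0]
n_masked = int(n * mask_ratio)
perm = rng.permutation(n)
masked_idx, visible_idx = perm[:n_masked], perm[n_masked:]

# Stand-in "encoder"/"decoder": single linear maps instead of transformers.
W_enc = rng.normal(scale=0.1, size=(J * D, 64))
W_dec = rng.normal(scale=0.1, size=(64, J * D))

latent = tokens[visible_idx] @ W_enc  # encode only the visible tokens
mask_token = np.zeros((1, 64))        # would be a learnable vector in practice

# Assemble the full sequence: visible latents plus mask tokens at masked slots.
full = np.repeat(mask_token, n, axis=0)
full[visible_idx] = latent
recon = full @ W_dec  # decode every token back to pose space

# Reconstruction loss is computed only on the masked tokens.
loss = np.mean((recon[masked_idx] - tokens[masked_idx]) ** 2)
print(f"masked tokens: {n_masked}/{n}, loss: {loss:.4f}")
```

Pre-training then minimizes this masked-reconstruction loss, after which the encoder can be fine-tuned on downstream multi-person motion tasks.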