Efficient Transformer for Accurate Monocular 3D Human Shape and Pose Estimation
The proposed SMPLer Transformer framework exploits high-resolution image features for accurate 3D human shape and pose estimation by introducing an efficient decoupled attention mechanism and a compact SMPL-based target representation.