The paper proposes a new 3D hand mesh reconstruction network called HandSSCA that introduces state space modeling into the field of hand pose estimation for the first time. The key contributions are:
The HandSSCA network uses state space modeling to effectively improve hand reconstruction performance without the need for additional prior knowledge.
The paper proposes a parallel spatial and channel scanning approach, in which the state space channel attention (SSCA) module enlarges the effective receptive field while keeping the parameter count small.
The method achieves state-of-the-art performance on the FreiHAND, DexYCB, and HO3D datasets, outperforming recent methods while using significantly fewer parameters.
The paper first provides an overview of the HandSSCA architecture, which consists of a backbone feature extractor, the SSCA module, and a regressor. The SSCA module is the core innovation, using state space modeling to perform spatial and channel-wise scanning to capture both local and global hand features, even under severe occlusion.
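To make the idea of scanning along both the spatial and the channel axes concrete, the following is a minimal toy sketch, not the authors' SSCA implementation. It assumes a scalar linear state-space recurrence with fixed coefficients `a`, `b`, `c`, a bidirectional scan, and a simple additive combination of the two branches. All function names and these design choices are hypothetical; the actual module uses learned parameters and may combine the branches differently.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Toy linear state-space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t
    Coefficients are fixed scalars here (an assumption for illustration).
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def bidirectional_scan(xs, **kw):
    """Sum of a forward scan and a backward scan, so every output
    position depends on every input position in the sequence."""
    fwd = ssm_scan(xs, **kw)
    bwd = ssm_scan(xs[::-1], **kw)[::-1]
    return [f + g for f, g in zip(fwd, bwd)]


def spatial_channel_scan(feat, a=0.5):
    """feat: list of C channels, each a list of N flattened spatial positions.

    Two branches are computed independently of each other:
      - spatial branch: scan along positions within each channel
      - channel branch: scan along channels at each position
    The result adds both branches to the input (residual connection).
    """
    num_c, num_n = len(feat), len(feat[0])

    # Spatial branch: one scan per channel.
    spatial = [bidirectional_scan(ch, a=a) for ch in feat]

    # Channel branch: transpose, scan each column, transpose back.
    columns = [[feat[c][n] for c in range(num_c)] for n in range(num_n)]
    scanned = [bidirectional_scan(col, a=a) for col in columns]
    channel = [[scanned[n][c] for n in range(num_n)] for c in range(num_c)]

    return [
        [feat[c][n] + spatial[c][n] + channel[c][n] for n in range(num_n)]
        for c in range(num_c)
    ]


if __name__ == "__main__":
    # A single impulse at channel 0, position 0.
    feat = [[1.0, 0.0, 0.0],
            [0.0, 0.0, 0.0]]
    for row in spatial_channel_scan(feat, a=0.5):
        print(row)
```

In this toy setting, a single impulse spreads along the whole spatial row and along the channel column, which illustrates how a recurrence of this kind can give a wide receptive field with only a handful of parameters per scan.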
Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed method. Compared to prior work, HandSSCA maintains state-of-the-art performance while using up to five times fewer parameters. Ablation studies further validate the contribution of the SSCA module in expanding the effective receptive field and enhancing hand feature extraction.