A new Spatial-Temporal Part-aware Network (StepNet) that effectively captures the fine-grained spatial and temporal cues in sign language videos without using any keypoint-level annotations.
Proposing TCNet to efficiently capture spatial interactions over time in sign language recognition.
Proposing a novel motor attention mechanism and applying self-distillation to improve continuous sign language recognition.
The authors propose a novel motor attention mechanism that captures the dynamic changes in sign language expressions, improving recognition accuracy. In addition, self-distillation is applied to strengthen feature representations without increasing computational cost.
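The two ideas above can be sketched roughly in code. This is an illustrative assumption, not the papers' actual implementation: motor attention is approximated here by weighting frames according to how much their features change between steps, and self-distillation by a KL term pulling a shallow branch's predictions toward a deeper branch's predictions.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def motor_attention(frames):
    """Hypothetical motor-attention sketch.
    frames: list of per-frame feature vectors (lists of floats).
    Frames with larger motion (feature change vs. the previous frame)
    receive larger attention weights."""
    motion = [0.0]  # first frame has no predecessor, so zero motion
    for prev, cur in zip(frames, frames[1:]):
        motion.append(math.sqrt(sum((c - p) ** 2 for c, p in zip(cur, prev))))
    weights = softmax(motion)
    # Scale each frame's features by its attention weight.
    return [[w * x for x in f] for w, f in zip(weights, frames)]

def self_distill_loss(student_logits, teacher_logits):
    """Hypothetical self-distillation sketch: KL(teacher || student)
    between softened predictions of a deep branch (teacher) and a
    shallow branch (student) of the same network."""
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)
    return sum(pt * (math.log(pt) - math.log(ps)) for pt, ps in zip(p_t, p_s))

# Toy usage: the last frame moves sharply, so it should dominate.
frames = [[1.0, 1.0], [1.0, 1.0], [3.0, 0.0]]
weighted = motor_attention(frames)
```

The frame-difference heuristic stands in for whatever learned attention the paper actually uses; the point is only that attention is driven by temporal change rather than static appearance.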