Core Concepts
The proposed Hybrid Dual-Branch Network (HDBN) effectively combines Graph Convolutional Networks (GCNs) and Transformers to achieve robust and accurate skeleton-based action recognition.
Abstract
The paper presents a novel Hybrid Dual-Branch Network (HDBN) for robust skeleton-based action recognition. The key highlights are:
The HDBN consists of two trunk branches: MixGCN and MixFormer.
The MixGCN branch utilizes GCNs to separately process 2D and 3D skeleton inputs, and employs a late-fusion strategy to aggregate the classification results.
The MixFormer branch models the skeleton inputs with Transformers, drawing on their ability to abstract global information.
By leveraging the complementarity between GCNs and Transformers, the proposed HDBN effectively integrates the strengths of both network structures to achieve better human action recognition.
Extensive experiments on the benchmark UAV-Human dataset show that the proposed HDBN outperforms most existing action recognition methods.
The authors conduct detailed ablation studies to analyze the performance of different skeleton modalities and network backbones within the HDBN framework.
Overall, the HDBN provides a robust and effective solution for skeleton-based action recognition by harnessing the complementary advantages of GCNs and Transformers.
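The late-fusion step mentioned above can be illustrated with a minimal sketch. It assumes each stream (for example a 2D-skeleton GCN, a 3D-skeleton GCN, and a Transformer branch) has already produced per-class logits, and that fusion is a weighted sum of softmax scores. The function names, the fusion weights, and the choice to fuse softmax scores rather than raw logits are assumptions made for illustration; the summary does not state how the authors implement the aggregation.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def late_fusion(stream_logits, weights=None):
    """Fuse per-stream class scores by a weighted sum (hypothetical helper).

    stream_logits: list of arrays, each of shape (batch, num_classes).
    weights: one scalar per stream; defaults to equal weights (an assumption).
    Returns the fused score array and the predicted class index per sample.
    """
    if weights is None:
        weights = [1.0] * len(stream_logits)
    if len(weights) != len(stream_logits):
        raise ValueError("need exactly one weight per stream")
    fused = sum(
        w * softmax(np.asarray(logits, dtype=float))
        for w, logits in zip(weights, stream_logits)
    )
    return fused, fused.argmax(axis=-1)


# Toy logits for one sample and three classes (made-up numbers, not paper data).
logits_2d = np.array([[2.0, 1.0, 0.0]])  # stream trained on 2D skeletons
logits_3d = np.array([[0.0, 3.0, 0.0]])  # stream trained on 3D skeletons
fused_scores, predictions = late_fusion([logits_2d, logits_3d])
print(predictions)
```

Because each stream is scored independently and combined only at the end, streams can be added, dropped, or reweighted without retraining the others, which is what makes per-modality and per-backbone ablations straightforward.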
Stats
On the UAV-Human dataset, the proposed HDBN achieves an accuracy of 47.95% on the CSv1 benchmark and 75.36% on the CSv2 benchmark, outperforming most existing methods.
Quotes
"By leveraging the proposed HDBN, we effectively integrate GCNs and TransFormers to achieve better human action recognition."
"Our proposed HDBN emerged as one of the top solutions in the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) of 2024 ICME Grand Challenge, achieving accuracies of 47.95% and 75.36% on two benchmarks of the UAV-Human dataset by outperforming most existing methods."