SkelVIT is an approach to skeleton-based action recognition that combines a pseudo-image representation of skeleton data with vision transformers (VITs). The study compares SkelVIT with state-of-the-art methods and reports superior performance. It also compares the sensitivity of VITs with that of CNN models and explores how ensemble classifiers affect recognition accuracy.
The work discusses the significance of different representation schemes in action recognition and evaluates how effectively VITs improve classification results. Based on the reported experiments and comparisons, SkelVIT is presented as a promising solution for efficient and accurate skeleton-based action recognition.
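To make the two ideas above more concrete, here is a minimal sketch of what a pseudo-image representation and an ensemble decision could look like. It is not taken from the paper: the layout (rows are joints, columns are frames, colour channels are scaled x, y, z coordinates), the probability-averaging rule, and the names `skeleton_to_pseudo_image` and `ensemble_predict` are all assumptions chosen for illustration. The scheme actually used by SkelVIT may differ.

```python
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) coordinates of one joint
Frame = Sequence[Joint]              # all joints at one time step


def skeleton_to_pseudo_image(
    frames: Sequence[Frame],
) -> List[List[Tuple[int, int, int]]]:
    """Hypothetical mapping: rows = joints, columns = frames, RGB = scaled (x, y, z)."""
    if not frames:
        raise ValueError("need at least one frame")
    num_joints = len(frames[0])
    # Per-axis minimum and maximum over the whole sequence, used for scaling.
    lows = [min(j[a] for f in frames for j in f) for a in range(3)]
    highs = [max(j[a] for f in frames for j in f) for a in range(3)]

    def scale(value: float, axis: int) -> int:
        span = highs[axis] - lows[axis]
        if span == 0:
            return 0  # constant axis: map everything to 0
        return round(255 * (value - lows[axis]) / span)

    return [
        [tuple(scale(frame[j][a], a) for a in range(3)) for frame in frames]
        for j in range(num_joints)
    ]


def ensemble_predict(prob_lists: Sequence[Sequence[float]]) -> int:
    """Assumed consensus rule: average class probabilities, return the argmax index."""
    num_classes = len(prob_lists[0])
    mean = [
        sum(p[c] for p in prob_lists) / len(prob_lists)
        for c in range(num_classes)
    ]
    return max(range(num_classes), key=mean.__getitem__)
```

In this sketch, each classifier (for example, one VIT per representation scheme) would be applied to its own pseudo-image, and `ensemble_predict` would combine the resulting probability vectors. Averaging is only one possible consensus rule; majority voting would be an equally plausible choice.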
Key insights distilled from the source content by Ozge Oztimur... at arxiv.org, 03-08-2024
https://arxiv.org/pdf/2311.08094.pdf