SkelVIT is an approach to skeleton-based action recognition that combines a pseudo-image representation of skeleton data with vision transformers (VITs). The study compares SkelVIT with state-of-the-art methods and reports superior performance. It also examines how sensitive VITs are to the representation scheme compared with CNN models, and how ensemble classifiers affect recognition accuracy.
The paper discusses why the choice of representation scheme matters in action recognition and evaluates how effectively VITs improve classification results. Based on its experiments and comparisons, it presents SkelVIT as a promising solution for efficient and accurate skeleton-based action recognition.
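To make the idea of a pseudo-image concrete, here is a minimal, dependency-free sketch of one common way to lay out a skeleton sequence as an image: joints become rows, frames become columns, and the (x, y, z) coordinates become the three colour channels. This summary does not specify the representation scheme SkelVIT actually uses, so the layout, the min-max scaling, and the function name `skeleton_to_pseudo_image` are all assumptions made for illustration.

```python
def skeleton_to_pseudo_image(sequence):
    """Map a skeleton sequence to a joints x frames x 3 pseudo-image.

    Hypothetical helper, not taken from the paper.
    sequence[t][j] is the (x, y, z) position of joint j at frame t.
    Returns image[j][t] = [r, g, b] with integer values in 0..255.
    """
    if not sequence or not sequence[0]:
        raise ValueError("sequence needs at least one frame and one joint")
    num_frames, num_joints = len(sequence), len(sequence[0])

    # Per-axis minimum and maximum over all frames and joints,
    # used to min-max scale each coordinate axis independently.
    lows = [min(p[a] for frame in sequence for p in frame) for a in range(3)]
    highs = [max(p[a] for frame in sequence for p in frame) for a in range(3)]

    image = []
    for j in range(num_joints):          # one image row per joint
        row = []
        for t in range(num_frames):      # one image column per frame
            pixel = []
            for a in range(3):           # x, y, z -> three channels
                span = highs[a] - lows[a]
                # A constant axis carries no information; map it to 0.
                value = 0.0 if span == 0 else (sequence[t][j][a] - lows[a]) / span
                pixel.append(round(value * 255))
            row.append(pixel)
        image.append(row)
    return image


# Toy input (made-up numbers): 3 frames, 2 joints.
toy = [
    [(0.0, 0.0, 1.0), (1.0, 2.0, 1.0)],
    [(0.5, 1.0, 1.0), (2.0, 4.0, 1.0)],
    [(1.0, 2.0, 1.0), (2.0, 0.0, 1.0)],
]
image = skeleton_to_pseudo_image(toy)
```

An image built this way would still need resizing to the input resolution an image classifier expects before it could be passed to a VIT or a CNN; that step is omitted here.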
Key insights from:
by Ozge Oztimur... at arxiv.org, 03-08-2024
https://arxiv.org/pdf/2311.08094.pdf