The article introduces P2ANet, a benchmark dataset for dense action detection in table tennis videos. It discusses the challenges of recognizing and localizing fast-moving actions in sports videos, particularly in table tennis. The dataset consists of 2,721 video clips from professional matches, annotated with fine-grained action labels. Various action recognition and localization models are evaluated on P2ANet, highlighting the difficulty of achieving high accuracy due to the dense and fast nature of the actions. The article also details the dataset construction, annotation process, and the development of a specialized annotation toolbox for efficient labeling.
Key insights drawn from the article by Jiang Bian, X... (arxiv.org, 03-27-2024)
https://arxiv.org/pdf/2207.12730.pdf