The article introduces P2ANet, a benchmark dataset for dense action detection in table tennis videos. It discusses the challenges of recognizing and localizing fast-moving actions in sports videos, particularly table tennis. The dataset consists of 2,721 video clips from professional matches, annotated with fine-grained action labels. Several action recognition and localization models are evaluated on P2ANet, and the evaluation highlights how hard it is to achieve high accuracy when actions are this dense and fast. The article also details the dataset construction, the annotation process, and a specialized annotation toolbox developed for efficient labeling.
Key insights extracted from
by Jiang Bian, X... at arxiv.org, 03-27-2024
https://arxiv.org/pdf/2207.12730.pdf
Deeper Questions