The article introduces P2ANet, a benchmark dataset for dense action detection in table tennis videos. It discusses the challenges of recognizing and localizing fast-moving actions in sports videos, particularly in table tennis. The dataset consists of 2,721 video clips from professional matches, annotated with fine-grained action labels. Various action recognition and localization models are evaluated on P2ANet, highlighting the difficulty of achieving high accuracy due to the dense and fast nature of the actions. The article also details the dataset construction, annotation process, and the development of a specialized annotation toolbox for efficient labeling.
Key ideas extracted from the source by Jiang Bian, X... at arxiv.org, 03-27-2024
https://arxiv.org/pdf/2207.12730.pdf