The article introduces P2ANet, a benchmark dataset for dense action detection in table tennis videos. It discusses the challenges of recognizing and localizing fast-moving actions in sports videos, particularly in table tennis. The dataset consists of 2,721 video clips from professional matches, annotated with fine-grained action labels. Various action recognition and localization models are evaluated on P2ANet, highlighting the difficulty of achieving high accuracy due to the dense and fast nature of the actions. The article also details the dataset construction, annotation process, and the development of a specialized annotation toolbox for efficient labeling.
Key insights extracted from
by Jiang Bian, X... at arxiv.org 03-27-2024
https://arxiv.org/pdf/2207.12730.pdf