Core Concepts
Unsupervised algorithmic methods leveraging optical flow can outperform supervised neural network models for generic event boundary detection in videos.
Summary
The paper proposes FlowGEBD, an unsupervised and non-parametric approach for generic event boundary detection in videos. It introduces two algorithms:
- Pixel Tracking (PT): This method tracks a sparse set of pixels across frames using sparse optical flow and identifies event boundaries based on significant changes in the number of tracked pixels.
- Flow Normalization (FN): This method computes dense optical flow for each frame, aggregates the maximum flow for each patch, normalizes the flow over time, and identifies event boundaries based on high normalized flow values.
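The two decision rules can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes the optical-flow outputs (per-frame tracked-pixel counts and dense flow magnitudes) as already computed, and the function names, the relative-change threshold, the patch size, the averaging of patch maxima into one score per frame, and the z-score normalization are all hypothetical choices that the summary does not specify.

```python
from statistics import mean, pstdev


def pt_boundaries(tracked_counts, min_rel_change=0.5):
    """Pixel Tracking idea: flag frames where the tracked-pixel count changes sharply.

    tracked_counts[i] is the number of pixels still tracked at frame i
    (assumed to come from a sparse optical-flow tracker, not shown here).
    min_rel_change is a hypothetical threshold, not a value from the paper.
    """
    boundaries = []
    for i in range(1, len(tracked_counts)):
        prev, curr = tracked_counts[i - 1], tracked_counts[i]
        rel_change = abs(curr - prev) / max(prev, 1)
        if rel_change >= min_rel_change:
            boundaries.append(i)
    return boundaries


def fn_boundaries(flow_mags, patch=2, z_thresh=1.5):
    """Flow Normalization idea: per-patch max flow, normalized over time, thresholded.

    flow_mags[t][y][x] is the dense flow magnitude at pixel (x, y) of frame t.
    Averaging the patch maxima and using a z-score are assumptions of this sketch.
    """
    scores = []
    for frame in flow_mags:
        h, w = len(frame), len(frame[0])
        # Maximum flow magnitude inside each patch x patch block.
        patch_maxima = [
            max(frame[y][x]
                for y in range(py, min(py + patch, h))
                for x in range(px, min(px + patch, w)))
            for py in range(0, h, patch)
            for px in range(0, w, patch)
        ]
        scores.append(mean(patch_maxima))
    # Normalize the per-frame scores over the whole clip.
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return []
    return [t for t, s in enumerate(scores) if (s - mu) / sigma >= z_thresh]
```

Both functions return frame indices; converting them to timestamps only requires dividing by the frame rate. Neither function has any learned parameters, which matches the non-parametric framing of the method.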
The authors conduct extensive experiments on the challenging Kinetics-GEBD and TAPOS datasets. Key findings:
- FlowGEBD, the ensemble of PT and FN, achieves state-of-the-art results among unsupervised methods on Kinetics-GEBD, outperforming supervised neural network baselines.
- FlowGEBD obtains an F1@0.05 score of 0.713 on Kinetics-GEBD, an absolute gain of 31.7 percentage points over the unsupervised baseline.
- On the TAPOS dataset, FlowGEBD achieves an average F1 score of 0.623, an 8% improvement over the unsupervised baseline.
- The proposed methods are non-parametric, computationally efficient, and robust to threshold variations, making them suitable for real-world applications.
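The summary does not define F1@0.05. A common reading in the GEBD benchmark is an F1 score computed at a relative-distance threshold of 0.05, meaning a predicted boundary counts as correct if it lies within 5% of the video duration of a ground-truth boundary. The sketch below illustrates that reading only; the function name, the greedy one-to-one matching, and the use of the full video duration as the reference length are assumptions, not details taken from the paper.

```python
def f1_at_rel_dis(pred, truth, duration, rel_dis=0.05):
    """F1 where a prediction is a hit if it lies within rel_dis * duration
    seconds of a not-yet-matched ground-truth boundary (assumed definition)."""
    tol = rel_dis * duration
    unmatched = sorted(truth)
    tp = 0
    for p in sorted(pred):
        # Greedily match each prediction to the nearest unmatched boundary.
        best = min(unmatched, key=lambda g: abs(g - p), default=None)
        if best is not None and abs(best - p) <= tol:
            unmatched.remove(best)
            tp += 1
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)
```

For a 10-second clip the tolerance is 0.5 seconds, so with predictions at 2.1 s, 5.0 s, and 8.0 s and ground truth at 2.0 s and 6.0 s, only the first prediction is a hit.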
Statistics
Video accounted for 82.5% of all web traffic in 2023, making it the most popular form of content on the internet.
The Kinetics-GEBD dataset contains 54,691 videos of 10 seconds each, spanning a broad spectrum of video domains.
The validation set of the TAPOS dataset contains 1,790 instances of Olympic sports videos.
Quotes
"Generic Event Boundary Detection (GEBD) task aims to recognize generic, taxonomy-free boundaries that segment a video into meaningful events."
"Our method FlowGEBD achieves state-of-the-art results among unsupervised methods compared to non-parametric and parametric benchmarks."
"FlowGEBD exceeds the neural models on the Kinetics-GEBD dataset by obtaining an F1@0.05 score of 0.713 with an absolute gain of 31.7% compared to the unsupervised baseline."