Core Concepts
SNRO, a novel framework for video class-incremental learning, slightly shifts the features of new classes during their training stage to greatly improve the performance of old classes, while consuming the same memory as existing methods.
Summary
The authors propose a novel framework called SNRO for video class-incremental learning. SNRO consists of two key components:
- Examples Sparse:
  - Sparse Extract: SNRO samples the videos of old classes at a lower frame rate, so a larger memory set can be stored under the same memory consumption.
  - Frame Alignment: SNRO interpolates the sparse frames to match the network's input length; the interpolated clip carries less spatio-temporal information than a densely sampled one (see the sketch after this list).
- Early Break:
  - SNRO terminates training at a small epoch during each incremental stage, preventing the model from over-fitting to the newly seen classes (a training-loop sketch follows the summary paragraph below).
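The summary above describes Examples Sparse only in prose; the following is a minimal NumPy sketch of the idea, assuming the network expects clips of F frames while exemplars are stored with F/2 frames. Function names such as `sparse_extract` and `frame_align` are illustrative, not taken from the paper.

```python
import numpy as np

def sparse_extract(video: np.ndarray, num_frames: int) -> np.ndarray:
    """Keep only `num_frames` evenly spaced frames (e.g. F/2 instead of F),
    so each stored exemplar costs less memory and more old-class videos
    fit into the same memory budget."""
    t = video.shape[0]
    idx = np.linspace(0, t - 1, num_frames).round().astype(int)
    return video[idx]

def frame_align(sparse_clip: np.ndarray, target_frames: int) -> np.ndarray:
    """Linearly interpolate a sparse clip to `target_frames` frames so it
    matches the frame count the network expects as input."""
    t = sparse_clip.shape[0]
    src = np.linspace(0.0, t - 1.0, target_frames)
    lo = np.floor(src).astype(int)
    hi = np.minimum(lo + 1, t - 1)
    w = (src - lo).reshape(-1, *([1] * (sparse_clip.ndim - 1)))
    return (1.0 - w) * sparse_clip[lo] + w * sparse_clip[hi]

# Example: store an 8-frame exemplar, then align it to a 16-frame input clip.
video = np.random.rand(32, 224, 224, 3)          # raw video (T, H, W, C)
exemplar = sparse_extract(video, num_frames=8)   # kept in the memory set
clip = frame_align(exemplar, target_frames=16)   # fed to the network
print(exemplar.shape, clip.shape)                # (8, 224, 224, 3) (16, 224, 224, 3)
```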
By slightly dropping the performance of the current task, SNRO greatly improves the performance of previous tasks, effectively alleviating the catastrophic forgetting of old classes. Experiments on UCF101, HMDB51, and UESTC-MMEA-CL datasets demonstrate the effectiveness of SNRO compared to state-of-the-art methods.
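Early Break amounts to an early-stopping rule applied to the incremental stages. Below is a hedged PyTorch-style sketch, assuming a hypothetical `train_task` helper and illustrative epoch budgets; none of the names or numbers come from the paper.

```python
import torch
import torch.nn as nn

BASE_EPOCHS = 50         # full budget for the first (base) task (assumed value)
EARLY_BREAK_EPOCH = 25   # smaller budget for incremental tasks (assumed value)

def train_task(model, loader, task_id, lr=0.01):
    """Train one task; incremental tasks (task_id > 0) stop at a small epoch
    so the model does not over-fit to the newly seen classes."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    max_epochs = BASE_EPOCHS if task_id == 0 else EARLY_BREAK_EPOCH
    for epoch in range(max_epochs):  # Early Break: stop before full convergence
        for clips, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(clips), labels)
            loss.backward()
            optimizer.step()
    return model
```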
Statistics
Recent video class-incremental learning methods tend to excessively pursue accuracy on the newly seen classes and rely on memory sets to mitigate catastrophic forgetting of the old classes.
Limited storage allows only a few representative videos to be stored.
Quotes
"SNRO significantly alleviates the catastrophic forgetting of old classes at the cost of slightly drop the performance of the current new classes, thereby improving the overall recognition accuracy."
"Examples Sparse ensures we build larger memory sets consuming the same space as TCD. And using F/2 frames to represent a video contains less spatio-temporal information than using F frames, it effectively prevents the network from over-stretching to high-semantic spaces, which allows preserving more low semantic features in future incremental tasks."
"Early Break effectively prevents the tendency of over-fit to new classes, achieving a 0.73% CNN improvement with the same memory set construction method."