This paper introduces a framework for online temporal action segmentation (TAS) in videos: an adaptive memory bank captures temporal context, and a context-aware feature augmentation module enhances frame representations. The authors report state-of-the-art performance on online action segmentation.
This paper introduces two methods, surround dense sampling and Online Temporally Aware Label Cleaning (O-TALC), that improve online temporal action segmentation by addressing inaccurate segment boundaries and oversegmentation.
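
To make the oversegmentation problem concrete, the sketch below shows a generic causal minimum-duration filter applied to a stream of per-frame predictions. It is an illustration only: the summary above does not describe how O-TALC works, so this should not be read as the paper's algorithm. The names `OnlineLabelCleaner`, `min_len`, and `count_segments` are hypothetical, and the frame labels are made-up toy values, not results.

```python
from typing import List, Optional


class OnlineLabelCleaner:
    """Hypothetical causal filter: a new label is accepted only after it has
    been predicted for `min_len` consecutive frames. Not the O-TALC method."""

    def __init__(self, min_len: int = 3) -> None:
        if min_len < 1:
            raise ValueError("min_len must be at least 1")
        self.min_len = min_len
        self.current: Optional[int] = None    # label currently being emitted
        self.candidate: Optional[int] = None  # label trying to take over
        self.run = 0                          # consecutive frames seen for candidate

    def update(self, pred: int) -> int:
        """Consume one raw per-frame prediction and return the cleaned label."""
        if self.current is None:
            # First frame: nothing to compare against, accept it as is.
            self.current = pred
            return pred
        if pred == self.current:
            # Prediction agrees with the emitted label: drop any pending candidate.
            self.candidate, self.run = None, 0
        elif pred == self.candidate:
            self.run += 1
        else:
            self.candidate, self.run = pred, 1
        if self.candidate is not None and self.run >= self.min_len:
            # Candidate persisted long enough: switch to it from this frame on.
            self.current = self.candidate
            self.candidate, self.run = None, 0
        return self.current


def count_segments(labels: List[int]) -> int:
    """Number of maximal runs of identical labels."""
    if not labels:
        return 0
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


# Toy stream with two one-frame flickers (frames 3 and 10).
raw = [0, 0, 0, 1, 0, 0, 2, 2, 2, 2, 0, 2, 2]
cleaner = OnlineLabelCleaner(min_len=3)
cleaned = [cleaner.update(p) for p in raw]
print(count_segments(raw), count_segments(cleaned))
```

The filter never revises labels it has already emitted, which keeps it usable online, but the cost is that each accepted boundary is reported `min_len - 1` frames late. That tension between fewer spurious segments and accurate boundaries is the trade-off the summary points to.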