Action Detection via a Diffusion-based Image Generation Process
The core message of this paper is that action detection can be effectively tackled by formulating it as a three-image generation problem, where the starting point, ending point, and action-class predictions are generated as images via a diffusion-based framework.