Efficient Joint Stream Embedding Network for Effective Violence Detection in Surveillance Videos
JOSENet is a novel self-supervised framework that achieves strong violence-detection performance in surveillance videos by combining a joint stream embedding network with a regularized self-supervised learning approach.