Identifying hateful memes and their targeted entities in low-resource languages like Bengali is crucial for understanding social dynamics and countering hate speech.
Recognizing when humans are speaking, using multimodal signals, for privacy-preserving segmentation.
This work presents MM-AU, a dataset for Multi-Modal Accident video Understanding that supports a range of accident understanding tasks. The AdVersa-SD framework takes an abductive approach to safe driving perception.