Community models for malicious content detection on social media graphs often perform well on benchmark datasets but struggle to generalize to new graphs, domains, and tasks. A novel few-shot subgraph sampling approach is proposed to better assess the inductive generalization capabilities of these models.
Evaluations of hate speech detection models on biased datasets greatly overestimate real-world performance on representative Nigerian Twitter data. Domain-adaptive pretraining and finetuning on diverse data are key to maximizing hate speech detection performance in this low-resource context.