Core Concepts
Incorporating temporal and emotional features into deep learning models can significantly improve the performance of pathological gambling detection on social media.
Abstract
The paper addresses the problem of predicting pathological gambling behavior using social media data, specifically from Reddit. The authors explore various deep learning architectures and techniques to tackle the challenges of class imbalance, temporal irregularity, and interpretability.
Key highlights:
- Baseline models: A text-based BERT classifier on concatenated posts and a sequential GRU+LSTM model on post sequences.
- Proposed model: Incorporates BERT and EmoBERTa embeddings, a time decay layer, and an attention mechanism to capture temporal and emotional cues.
- Experiments show that the sequential models outperform the concatenation-based approach, and the inclusion of time decay and emotion features significantly boosts performance.
- The attention mechanism provides interpretability by showing which parts of the text the model attends to when making a prediction.
- The proposed model achieves state-of-the-art results on the eRisk pathological gambling dataset, outperforming existing benchmarks.
- Limitations include the small size of the dataset and the need for further testing on other mental health datasets to assess generalizability.
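The summary does not say how the time decay layer or the attention mechanism is computed. As a rough illustration only, the sketch below assumes an exponential decay over the gap between each post and the user's most recent post, multiplied into softmax attention weights before pooling the per-post embeddings. The function names, the `decay_rate` parameter, and the toy inputs are hypothetical and are not taken from the paper.

```python
import math

def time_decay_weights(timestamps, decay_rate=0.1):
    """Assumed form: exponential decay by age relative to the latest post."""
    latest = max(timestamps)
    return [math.exp(-decay_rate * (latest - t)) for t in timestamps]

def softmax(scores):
    """Numerically stable softmax over a list of raw attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decayed_attention_pool(embeddings, scores, timestamps, decay_rate=0.1):
    """Scale attention weights by time decay, renormalize, and pool embeddings."""
    attn = softmax(scores)
    decay = time_decay_weights(timestamps, decay_rate)
    combined = [a * d for a, d in zip(attn, decay)]
    total = sum(combined)
    weights = [c / total for c in combined]
    dim = len(embeddings[0])
    pooled = [sum(w * e[i] for w, e in zip(weights, embeddings)) for i in range(dim)]
    return pooled, weights

# Toy example: three posts with 2-d embeddings (stand-ins for BERT/EmoBERTa
# outputs), equal attention scores, and timestamps in days.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 0.0]
timestamps = [0.0, 5.0, 10.0]
pooled, weights = decayed_attention_pool(embeddings, scores, timestamps)
```

With equal attention scores, the weights depend only on recency, so the most recent post contributes most to the pooled vector. Inspecting `weights` per post is one simple way to read off which posts the model emphasized.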
Stats
The dataset contains 4,384 users, with 245 positive (pathological gambling) and 4,139 negative labels. The minimum number of posts per user is 3, the maximum is 2,002, and the average is 520.
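To make the class imbalance concrete, the counts above can be turned into a positive rate and a pair of loss weights. The summary does not state how the authors handle the imbalance; inverse-frequency ("balanced") class weighting is used here only as one common option, and the variable names are illustrative.

```python
# User counts reported in the Stats section.
n_pos, n_neg = 245, 4139
n_total = n_pos + n_neg  # 4,384 users

# Share of users labeled as pathological gamblers (roughly 5.6%).
pos_rate = n_pos / n_total

# Assumption: balanced class weights, n_total / (n_classes * n_class).
# This is not confirmed as the paper's method.
n_classes = 2
w_pos = n_total / (n_classes * n_pos)
w_neg = n_total / (n_classes * n_neg)

print(f"positive rate: {pos_rate:.2%}")
print(f"class weights: positive={w_pos:.2f}, negative={w_neg:.2f}")
```

Under this weighting, an error on a positive user costs roughly 17 times as much as an error on a negative user, which matches the negative-to-positive ratio of the dataset.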
Quotes
"The incorporation of a time decay layer (TD) and passing the emotion classification layer (EmoBERTa) through LSTM improves the performance significantly."
"The developed architecture with the inclusion of EmoBERTa and TD layers achieved a high F1 score, beating existing benchmarks on pathological gambling dataset."