The paper explores the impact of incorporating sentiment, emotion, and domain-specific lexicons into transformer-based models for depression symptom estimation. The authors use the DAIC-WOZ dataset, which contains patient-therapist conversations with associated PHQ-8 scores, and the PRIMATE dataset, which contains Reddit posts annotated with PHQ-9 symptoms.
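Since the DAIC-WOZ targets are PHQ-8 scores, it may help to recall how a PHQ-8 total is formed. The questionnaire has eight items, each rated 0 to 3, so totals range from 0 to 24, and a total of 10 or more is a commonly used screening cutoff. The sketch below only illustrates that arithmetic; the function names are hypothetical and it is not taken from the paper.

```python
from typing import Sequence

PHQ8_ITEMS = 8        # number of questionnaire items
MAX_ITEM_SCORE = 3    # each item is rated 0..3


def phq8_total(item_scores: Sequence[int]) -> int:
    """Sum the eight item ratings into a total in the range 0..24."""
    if len(item_scores) != PHQ8_ITEMS:
        raise ValueError(f"expected {PHQ8_ITEMS} item scores, got {len(item_scores)}")
    for score in item_scores:
        if not 0 <= score <= MAX_ITEM_SCORE:
            raise ValueError(f"item score out of range 0..{MAX_ITEM_SCORE}: {score}")
    return sum(item_scores)


def meets_cutoff(total: int, cutoff: int = 10) -> bool:
    """Apply the commonly used screening cutoff (total >= 10)."""
    return total >= cutoff


if __name__ == "__main__":
    # Illustrative ratings only, not data from either dataset.
    ratings = [1, 2, 0, 3, 1, 0, 2, 1]
    total = phq8_total(ratings)
    print(total, meets_cutoff(total))
```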
The key findings are:
For the DAIC-WOZ dataset, adding lexicon information, especially from the sentiment and emotion lexicons, can improve the performance of transformer-based models in predicting both individual depression symptoms and the overall PHQ-8 score. Combining all three lexicons (AFINN for sentiment, NRC for emotion, and the depression-specific SDD) yields the best results for the MentalBERT model.
For the PRIMATE dataset, the impact of lexicon information is less pronounced, with only slight improvements observed for some symptoms. The SDD lexicon, which is specific to depression, provides the best results for some symptoms, while the benefits of AFINN and NRC are more limited.
The authors hypothesize that the results differ between the two datasets because the datasets differ conceptually. The DAIC-WOZ dataset aims to establish a link between a person's mental condition and their speech, whereas the PRIMATE dataset focuses on detecting whether a particular symptom is mentioned in the text.
The paper highlights the importance of adapting the external knowledge (lexicons) to the targeted task for optimal performance improvement.
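This summary does not describe how the lexicon information is fed into the models, so the following is only a minimal sketch of one plausible approach: marking lexicon hits in the input text before it reaches a tokenizer, and deriving simple aggregate features. The lexicon entries, the `[POS]`/`[NEG]` marker tokens, and all function names are hypothetical stand-ins; they are not the actual AFINN, NRC, or SDD contents, nor the authors' method.

```python
import re
from typing import Dict, List

# Toy sentiment lexicon with made-up illustrative scores
# (an assumption, not real AFINN/NRC/SDD entries).
TOY_LEXICON: Dict[str, int] = {
    "hopeless": -3,
    "sad": -2,
    "tired": -2,
    "happy": 3,
    "good": 3,
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_features(text: str, lexicon: Dict[str, int] = TOY_LEXICON) -> Dict[str, int]:
    """Aggregate the lexicon scores of all matched tokens."""
    scores = [lexicon[tok] for tok in tokenize(text) if tok in lexicon]
    return {
        "n_matched": len(scores),
        "score_sum": sum(scores),
        "n_negative": sum(1 for s in scores if s < 0),
        "n_positive": sum(1 for s in scores if s > 0),
    }


def tag_text(text: str, lexicon: Dict[str, int] = TOY_LEXICON) -> str:
    """Append a hypothetical marker token after each lexicon hit."""
    out: List[str] = []
    for tok in tokenize(text):
        out.append(tok)
        if tok in lexicon:
            out.append("[POS]" if lexicon[tok] > 0 else "[NEG]")
    return " ".join(out)


if __name__ == "__main__":
    utterance = "I feel sad and hopeless, but sleep was good."
    print(lexicon_features(utterance))
    print(tag_text(utterance))
```

Swapping in a depression-specific word list instead of a general sentiment list would be one way to adapt the external knowledge to the task, in line with the paper's conclusion.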