Core Concepts
This research introduces a novel guided summarization model, enhanced with domain-specific knowledge and a post-editing correction mechanism, to generate more relevant and faithful summaries of mental health posts.
Stats
The MENTSUM dataset comprises over 24k post-TL;DR pairs, divided into 21,695 training, 1,209 validation, and 1,215 test instances.
On average, each post contains 327.5 words or 16.9 sentences, while each TL;DR consists of 43.5 words or 2.6 sentences.
GSUM-TERM achieves a 1.5% higher FactCC score than BART and a 1.6% higher score than GSUM.
GSUM-SENT achieves a 2.7% higher FactCC score compared to BART and a 2.8% improvement over GSUM.
Only 10.3% of the summaries generated by GSUM-SENT undergo revisions by the corrector.
92.8% of the corrected summaries incorporate three or fewer new tokens, even though summaries average 53.27 tokens in length.
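The figures above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it recomputes the dataset total, the share of each split, and the post-to-TL;DR length ratio from the quoted numbers, and shows one possible way to count the new tokens a corrector introduces. The names `split_shares` and `count_new_tokens` are hypothetical, and whitespace tokenization is an assumption; the paper's own tokenizer and counting method may differ.

```python
from collections import Counter

# Figures quoted in the Stats section.
SPLITS = {"train": 21_695, "validation": 1_209, "test": 1_215}
AVG_POST_WORDS = 327.5
AVG_TLDR_WORDS = 43.5


def split_shares(splits):
    """Return each split's share of the full dataset as a fraction."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


def count_new_tokens(original, corrected):
    """Count tokens in `corrected` that are not accounted for in `original`.

    Assumption: whitespace tokenization with multiset counting, chosen
    only for illustration; it is not taken from the paper.
    """
    remaining = Counter(corrected.split()) - Counter(original.split())
    return sum(remaining.values())


total_pairs = sum(SPLITS.values())            # size of the whole dataset
shares = split_shares(SPLITS)                 # fraction per split
compression = AVG_POST_WORDS / AVG_TLDR_WORDS  # post words per TL;DR word

if __name__ == "__main__":
    print(f"total pairs: {total_pairs}")
    for name, share in shares.items():
        print(f"{name}: {share:.1%}")
    print(f"post/TL;DR word ratio: {compression:.2f}")
    # Made-up example pair, not drawn from MENTSUM.
    print(count_new_tokens("i feel sad and alone", "i feel anxious and alone"))
```

With the quoted numbers, the splits sum to 24,119 pairs, the training split accounts for roughly 90% of them, and the average post is about 7.5 times as long as its TL;DR. Under this counting scheme, a correction that swaps a single word registers as one new token.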
Quotes
"Mental health is a critical area that profoundly affects both individuals and society, demanding effective and accurate communication for support."
"The summary enables quicker review and response by professional counselors, thus enhancing support for individuals dealing with mental health issues and demonstrating significant social impact."
"This design is specifically tailored to enhance the summarization process within mental health contexts, guiding the generation of a summary that is both terminologically precise and richly informed by the underlying domain-specific information contained within the original text."