The REFLECTSUMM dataset was created to address the need for benchmarks that better represent real-life applications of summarization, particularly in underexplored domains. The dataset contains 17,512 student reflections on 782 university lectures from 24 large STEM classes, spanning four subjects: Engineering, Physics, Computer Science, and Computing Information.
The dataset provides three types of reference summaries for each set of reflections: extractive, abstractive, and phrase-level extractive. It also includes metadata such as reflection-level specificity scores and student demographic information.
The authors benchmarked the dataset on all three summarization tasks using multiple state-of-the-art baselines, including pretrained language models and large language models. The results show how current summarization techniques perform, and where they fall short, on student reflections.
The authors highlight the potential of the REFLECTSUMM dataset to enable further research in areas such as fairness and equity in summarization, as well as the exploration of specificity-aware summarization. The dataset and associated resources are publicly available, encouraging researchers to build upon this work and advance the field of summarization in educational and opinion-based domains.
Key Insights Distilled From
by Yang Zhong, M... at arxiv.org 03-29-2024
https://arxiv.org/pdf/2403.19012.pdf