MASIVE: A New Benchmark Dataset and Approach for Identifying Open-Ended Affective States in Text (English and Spanish)
This paper introduces Affective State Identification (ASI), a task for identifying an open-ended range of emotions and moods in text rather than choosing from a small, fixed set of emotion categories. It presents MASIVE, a new benchmark dataset collected from Reddit that contains over 1,000 unique affective state labels in English and Spanish. The authors show that smaller fine-tuned language models outperform larger language models on ASI, and that training on MASIVE improves performance on traditional emotion detection benchmarks. The paper also highlights the importance of native-language data for accurate affective state identification and suggests directions for future research on the task.