Core Concepts
Position bias in summarization models can severely impact the fairness of generated summaries, especially when summarizing diverse social data.
Abstract
The paper investigates the phenomenon of position bias in the context of social multi-document summarization. The key findings are:
When the input documents are presented in a randomly shuffled order, the summarization models exhibit no notable position bias, either in human-written reference summaries or in system-generated summaries.
However, when the input documents are ordered based on the dialect groups they belong to, the summarization models show a significant position bias, favoring the group whose documents appear first in the input.
This position bias in ordered inputs severely impacts the fairness of the generated summaries: the group whose documents appear first is significantly over-represented relative to the other groups. In contrast, shuffled input leads to more balanced summaries across all groups.
Interestingly, the textual quality of the summaries, as measured by standard metrics like ROUGE, BARTScore, BERTScore, and UniEval, remains largely consistent regardless of whether the input is ordered or shuffled.
The findings highlight the importance of considering fairness, in addition to textual quality, when evaluating summarization models, especially in the context of diverse social data. The results suggest that introducing randomness in the presentation of input data can help mitigate the position bias and improve the fairness of the generated summaries.
Stats
The average token overlap between the system-generated summaries and each document in the input set is used to quantify position bias.
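The statistic above can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: it assumes whitespace tokenization and defines overlap as the fraction of summary tokens (counted as a multiset) that also appear in a given input document, averaged per input position across examples.

```python
# Hypothetical sketch of the position-bias statistic: average token
# overlap between a system summary and the document at each input
# position. Tokenization and normalization choices are assumptions.
from collections import Counter

def token_overlap(summary_tokens, doc_tokens):
    # Fraction of summary tokens also found in the document
    # (multiset intersection over summary length).
    s, d = Counter(summary_tokens), Counter(doc_tokens)
    common = sum((s & d).values())
    return common / max(sum(s.values()), 1)

def overlap_by_position(examples):
    # examples: list of (summary, docs) pairs, where docs is the
    # ordered list of input documents for that example.
    # Returns {position: mean overlap at that position}.
    totals, counts = {}, {}
    for summary, docs in examples:
        s_toks = summary.lower().split()
        for pos, doc in enumerate(docs):
            ov = token_overlap(s_toks, doc.lower().split())
            totals[pos] = totals.get(pos, 0.0) + ov
            counts[pos] = counts.get(pos, 0) + 1
    return {pos: totals[pos] / counts[pos] for pos in totals}
```

A marked drop in mean overlap from the first position to later ones under ordered input, but not under shuffled input, would indicate the position bias the paper reports.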