Semi-Supervised Dialogue Abstractive Summarization via High-Quality Pseudolabel Selection
Stats
Comprehensive experiments on three public datasets demonstrate the effectiveness of SiCF scores in uncertainty estimation and semi-supervised learning for dialogue summarization tasks.
SiCF (m+BNN) generally improves performance over random ranking in both the small and medium-size labeled settings.
SiCF (m+BNN) scores higher than the pseudo oracle on ROUGE-1 and BERTScore-F in the SAMSUM 1:50 and DIALOGSUM 5:50 settings.
Quotes
"SiCF score is an effective way to improve uncertainty estimation."
"Our method surpasses pseudo oracle due to higher sample diversity."
"Using all the unlabeled dialogues is not the best choice because some samples have significant pseudolabel noise."
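The last quote suggests keeping only the unlabeled dialogues whose pseudolabels look reliable. A minimal sketch of that filtering step follows, assuming each pseudolabeled sample already has a quality score where higher means more trustworthy. The function name `select_pseudolabels`, the `keep_ratio` parameter, and the example scores are hypothetical and are not taken from the paper; the SiCF score computation itself is not shown.

```python
from typing import List, Sequence, Tuple

# A pseudolabeled sample: (dialogue text, model-generated summary).
Sample = Tuple[str, str]


def select_pseudolabels(
    samples: Sequence[Sample],
    scores: Sequence[float],
    keep_ratio: float = 0.5,
) -> List[Sample]:
    """Keep the top-scoring fraction of pseudolabeled samples.

    Assumption: a higher score means a more reliable pseudolabel.
    The scoring function is left abstract here.
    """
    if len(samples) != len(scores):
        raise ValueError("samples and scores must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    # Number of samples to keep (at least one when any are available).
    k = max(1, int(len(samples) * keep_ratio))

    # Rank sample indices by score, highest first.
    ranked = sorted(range(len(samples)), key=lambda i: scores[i], reverse=True)

    # Restore the original order among the kept samples.
    kept = sorted(ranked[:k])
    return [samples[i] for i in kept]


if __name__ == "__main__":
    pseudo = [
        ("dialogue A", "summary A"),
        ("dialogue B", "summary B"),
        ("dialogue C", "summary C"),
        ("dialogue D", "summary D"),
    ]
    quality = [0.9, 0.2, 0.7, 0.4]  # made-up scores for illustration
    for dialogue, summary in select_pseudolabels(pseudo, quality, keep_ratio=0.5):
        print(dialogue, "->", summary)
```

The kept samples would then be mixed with the labeled data for the next training round; the ratio to keep is a tunable choice rather than a value reported in the source.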