Core Concept
IMPOSSIBLE DISTILLATION distills high-quality paraphrase datasets and models from low-quality LMs using paraphrastic proximity and critic-guided filtering.
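The pipeline can be pictured as: sample several continuations of the same context from a small LM (same-context generations tend to paraphrase one another, i.e. paraphrastic proximity), pair the candidates, and keep only pairs that pass critic filters. Below is a minimal sketch under assumptions: it uses Hugging Face transformers with gpt2 as the low-quality teacher and roberta-large-mnli as an entailment critic; the helper names (generate_candidates, entails, dissimilar, distill_pairs), the thresholds, and the token-overlap heuristic are illustrative, not the paper's exact critic models.

```python
# Illustrative sketch of the distillation loop (assumed details, not the
# paper's exact implementation):
# 1) sample multiple continuations of one context from a small LM,
# 2) pair the candidate sentences,
# 3) keep pairs passing critics: bidirectional NLI entailment (semantic
#    equivalence) plus surface dissimilarity (reject near-copies).
from itertools import combinations

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")            # low-quality teacher
nli = pipeline("text-classification", model="roberta-large-mnli")  # entailment critic

def generate_candidates(context: str, n: int = 8) -> list[str]:
    """Sample n continuations of the same context from the small LM."""
    outputs = generator(
        context, do_sample=True, top_p=0.9, max_new_tokens=30,
        num_return_sequences=n, pad_token_id=generator.tokenizer.eos_token_id,
    )
    # Strip the shared context; keep only the first generated sentence.
    return [o["generated_text"][len(context):].split(".")[0].strip() + "."
            for o in outputs]

def entails(premise: str, hypothesis: str, thresh: float = 0.9) -> bool:
    """Critic 1 (one direction): NLI entailment above an assumed threshold."""
    scores = nli({"text": premise, "text_pair": hypothesis}, top_k=None)
    by_label = {s["label"]: s["score"] for s in scores}
    return by_label.get("ENTAILMENT", 0.0) >= thresh

def dissimilar(a: str, b: str, max_overlap: float = 0.7) -> bool:
    """Critic 2: reject near-copies via a crude token Jaccard overlap."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1) <= max_overlap

def distill_pairs(context: str) -> list[tuple[str, str]]:
    """Pair same-context generations; keep critic-approved paraphrases."""
    cands = generate_candidates(context)
    return [(a, b) for a, b in combinations(cands, 2)
            if a != b and entails(a, b) and entails(b, a) and dissimilar(a, b)]

if __name__ == "__main__":
    for a, b in distill_pairs("The city council announced on Monday that "):
        print(f"{a}  <->  {b}")
```

Running distill_pairs over many contexts yields the raw distilled dataset; the paper additionally applies further critics (e.g. fluency) and fine-tunes a student model on the filtered pairs.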
Key Statistics
IMPOSSIBLE DISTILLATION produces a high-quality dataset even from GPT-2-scale LMs.
The distilled 770M-parameter model consistently outperforms strong baselines on multiple benchmarks.
The dataset distilled from 1.5B-parameter LMs scores higher on quality metrics than state-of-the-art paraphrase datasets such as ParaBank and ChatGPT-Para.