The paper proposes a novel method called Plug and Play with Prompts (PPP) to achieve controlled text generation using large language models. The key idea is to train prompt embeddings that can steer the generation of text towards a desired style or attribute, while maintaining the fluency of the generated text.
The method has two main components: trainable prompt embeddings and a discriminator model that scores the target style.
The prompt embeddings are trained by backpropagating the discriminator's loss into them, alongside a fluency loss that keeps the generated text coherent. Only the prompts are updated this way, so they learn to steer generation toward the desired style without significantly degrading fluency.
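The training signal described above can be illustrated with a toy, dependency-free sketch. It is not the authors' implementation. The four-token "language model", the per-token discriminator scores, the choice of a KL divergence to the unprompted distribution as the fluency term, and all names and constants (`BASE_LOGITS`, `W`, `STYLE_SCORES`, `fluency_weight`) are assumptions made for illustration. The point it shows is that both models stay frozen and only the prompt vector receives gradient updates.

```python
import math

# Toy stand-ins (all hypothetical): a frozen "LM" whose next-token logits are
# BASE_LOGITS + W @ prompt, and a frozen discriminator with per-token scores.
BASE_LOGITS = [1.0, 0.5, 0.0, -0.5]                        # 4-token vocabulary
W = [[0.8, -0.2], [0.1, 0.9], [-0.5, 0.3], [-0.4, -1.0]]   # vocab x prompt_dim
STYLE_SCORES = [-2.0, -1.0, 1.0, 2.0]                      # positive = desired style


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def lm_distribution(prompt):
    """Next-token distribution of the frozen toy LM given a prompt vector."""
    logits = [b + sum(w * p for w, p in zip(row, prompt))
              for b, row in zip(BASE_LOGITS, W)]
    return softmax(logits)


def losses(prompt):
    """Return (style_loss, fluency_loss) for a prompt vector."""
    q = lm_distribution(prompt)
    q0 = lm_distribution([0.0] * len(prompt))        # unprompted reference
    s = sum(d * qi for d, qi in zip(STYLE_SCORES, q))
    style = math.log1p(math.exp(-s))                 # -log(sigmoid(s))
    # Assumed fluency term: KL(q || q0), zero when the prompt changes nothing.
    kl = sum(qi * math.log(qi / q0i) for qi, q0i in zip(q, q0))
    return style, kl


def total_loss(prompt, fluency_weight):
    style, kl = losses(prompt)
    return style + fluency_weight * kl


def prompt_gradient(prompt, fluency_weight):
    """Manual backpropagation of the total loss to the prompt vector only."""
    q = lm_distribution(prompt)
    q0 = lm_distribution([0.0] * len(prompt))
    s = sum(d * qi for d, qi in zip(STYLE_SCORES, q))
    kl = sum(qi * math.log(qi / q0i) for qi, q0i in zip(q, q0))
    dstyle_ds = -1.0 / (1.0 + math.exp(s))           # d/ds of -log(sigmoid(s))
    # Gradient with respect to the logits, through the softmax.
    grad_logits = [
        qi * (dstyle_ds * (d - s) + fluency_weight * (math.log(qi / q0i) - kl))
        for qi, q0i, d in zip(q, q0, STYLE_SCORES)
    ]
    # Chain rule through logits = BASE_LOGITS + W @ prompt; W itself is frozen.
    return [sum(g * row[k] for g, row in zip(grad_logits, W))
            for k in range(len(prompt))]


def train_prompt(steps=500, lr=0.1, fluency_weight=0.5):
    """Plain gradient descent on the prompt; model weights are never updated."""
    prompt = [0.0, 0.0]
    for _ in range(steps):
        grad = prompt_gradient(prompt, fluency_weight)
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt
```

Calling `train_prompt()` returns a prompt vector whose style loss is lower than that of the zero prompt, while the KL term penalizes drifting far from the unprompted distribution. Raising `fluency_weight` trades style control for closeness to the original model, which mirrors the style-versus-fluency balance the summary describes.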
The authors evaluate PPP on four datasets covering sentiment, formality, and toxicity control. They show that PPP significantly outperforms existing plug-and-play methods such as PPLM and GeDi on style control while maintaining similar fluency. Importantly, PPP achieves this level of control with very small training sets for the prompts (as few as a few hundred samples).
The authors also demonstrate PPP's ability to generalize to larger, out-of-domain datasets, and its potential to mitigate the generation of harmful and toxic text by language models.
Key insights distilled from
by Rohan Deepak... at arxiv.org 04-09-2024
https://arxiv.org/pdf/2404.05143.pdf