A novel method, Plug and Play with Prompts (PPP), that uses prompt tuning to steer text generation by large language models in a data- and parameter-efficient manner.
A low-rank autoregressive reward model can efficiently guide text generation from a base language model while maintaining comparable performance to a more flexible but computationally intensive reward model.
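
As a rough illustration of the low-rank idea in the second summary, the sketch below scores every candidate next token at once by passing a hidden state through two thin matrices, then adds the scaled scores to the base model's logits. This is a minimal sketch under stated assumptions, not the paper's implementation: the names `low_rank_reward_scores`, `guided_next_token`, `down`, `up`, and `beta`, the additive guidance rule, and all dimensions are hypothetical, and random numbers stand in for real model outputs.

```python
import random

def matvec(vec, mat):
    """Multiply a row vector (n,) by a matrix (n, m), returning a list of length m."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]

def low_rank_reward_scores(hidden, down, up):
    """Assumed factorization: hidden (d,) -> rank-r code (r,) -> one score per token (V,)."""
    return matvec(matvec(hidden, down), up)

def guided_next_token(base_logits, reward_scores, beta=1.0):
    """Greedy pick after shifting base logits by beta * reward scores (assumed rule)."""
    combined = [l + beta * s for l, s in zip(base_logits, reward_scores)]
    return max(range(len(combined)), key=combined.__getitem__)

random.seed(0)
d, r, V = 16, 2, 50  # hidden size, rank, vocabulary size (illustrative values)
hidden = [random.gauss(0, 1) for _ in range(d)]
down = [[random.gauss(0, 1) for _ in range(r)] for _ in range(d)]  # d x r
up = [[random.gauss(0, 1) for _ in range(V)] for _ in range(r)]    # r x V
base_logits = [random.gauss(0, 1) for _ in range(V)]

scores = low_rank_reward_scores(hidden, down, up)
token = guided_next_token(base_logits, scores, beta=1.0)
```

With these shapes the reward head stores d·r + r·V numbers instead of the d·V a full-rank head would need, which is where the efficiency would come from under this assumed factorization. Setting `beta=0.0` recovers unguided greedy decoding from the base logits.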