Soft-prompt Tuning for Large Language Models to Evaluate Bias
Stats
Prompting large language models (LLMs) has gained substantial popularity as pre-trained LLMs are capable of performing downstream tasks without requiring large quantities of labelled data.
Language model prompting avoids the labelled-data and compute costs of full fine-tuning, but designing prompts that induce optimal performance for a given downstream application is challenging.
Soft-prompt tuning has been shown to match, or nearly match, fine-tuning performance for various tasks such as classification, summarization, and question-answering.
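The core idea behind soft-prompt tuning can be sketched in a few lines: a small block of continuous prompt embeddings is prepended to the input embeddings, and only those prompt vectors are updated while the backbone stays frozen. This is a minimal toy illustration, not the paper's implementation; the mean-pooled linear "backbone" and finite-difference gradient are stand-ins for a real transformer and autodiff.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, PROMPT_LEN = 50, 8, 4

# Frozen "backbone": a fixed embedding table and a fixed linear classifier head.
embed_table = rng.normal(size=(VOCAB, DIM))   # frozen
clf_weights = rng.normal(size=(DIM, 2))       # frozen

# The only trainable parameters: PROMPT_LEN soft-prompt vectors.
soft_prompt = rng.normal(scale=0.1, size=(PROMPT_LEN, DIM))

def forward(token_ids, prompt):
    # Prepend the learnable prompt embeddings to the frozen token embeddings.
    x = np.concatenate([prompt, embed_table[token_ids]], axis=0)
    pooled = x.mean(axis=0)                   # toy pooling instead of a transformer
    return pooled @ clf_weights               # logits for 2 classes

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss(prompt, token_ids, label):
    return -np.log(softmax(forward(token_ids, prompt))[label])

tokens, label = np.array([3, 17, 42]), 1
eps, lr = 1e-5, 0.1

# One gradient step on the soft prompt only (finite differences for brevity).
base = loss(soft_prompt, tokens, label)
grad = np.zeros_like(soft_prompt)
for i in range(PROMPT_LEN):
    for j in range(DIM):
        p = soft_prompt.copy()
        p[i, j] += eps
        grad[i, j] = (loss(p, tokens, label) - base) / eps
soft_prompt -= lr * grad

print(base, loss(soft_prompt, tokens, label))  # loss should drop after the step
```

Only `soft_prompt` changed; `embed_table` and `clf_weights` are untouched, which is why the method is so much cheaper to store per task than full fine-tuning.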
Bias quantification has gained substantial attention from the research community recently.
The bias metrics explored here are the gaps in positive and negative false-positive rates across groups.
Quotes
"Prompting large language models (LLMs) has gained substantial popularity as pre-trained LLMs are capable of performing downstream tasks without requiring large quantities of labelled data."
"Soft-prompt tuning has been shown to match, or nearly match, fine-tuning performance for various tasks such as classification, summarization, and question-answering."
"Bias quantification has gained substantial attention from the research community recently."