
Reliable and Equitable Subset Selection using Aggregation in Large Language Models

Core Concepts
REQUAL-LM is a novel method that aggregates repeated samples from Large Language Models to mitigate bias and select a highly reliable response that properly represents minority groups.
The paper introduces REQUAL-LM, a novel method for finding reliable and equitable outputs from Large Language Models (LLMs) through aggregation. Because LLM outputs are randomized and can exhibit inherent biases, their reliability and equity are a concern in practical applications. REQUAL-LM addresses these challenges with a Monte Carlo method based on repeated sampling, finding a reliable output close to the mean of the underlying distribution of possible outputs. The paper formally defines reliability and bias, and designs an equity-aware aggregation that minimizes harmful bias while finding a highly reliable output. REQUAL-LM treats the LLM as a black box, enabling seamless scalability alongside the rapid advancement of LLM technologies; it does not require retraining the LLMs, making it deployment-ready and easy to adapt. Comprehensive experiments on various tasks and datasets demonstrate that REQUAL-LM effectively mitigates bias and selects a more equitable response that properly represents minority groups, outperforming baseline models in terms of reliability and equity.
"The extensive scope of large language models (LLMs) across various domains underscores the critical importance of responsibility in their application, beyond natural language processing."

"Addressing these challenges are necessary before using LLMs for applications with societal impact."

"REQUAL-LM does not require specialized hardware, does not impose a significant computing load, and uses LLMs as a black-box."

"Viewing each LLM output as a sample from the underlying distribution of possible outputs, it identifies the centroid of a collection of samples as its estimation of the mean of the distribution, and returns the closest output to the centroid as the most reliable one."

"To further achieve equity, REQUAL-LM assigns a weight to each sample proportional to how biased it is, and computes the weighted average as the equitable centroid."
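The weighted-centroid aggregation described above can be sketched in a few lines. This is a minimal illustration, assuming the sampled outputs have already been embedded and scored for bias; the specific weighting scheme (inverse to the bias score) and the bias scale are assumptions, not the paper's exact formulation:

```python
import numpy as np

def requal_select(embeddings, bias_scores):
    """Pick the index of the most reliable, equity-aware output.

    embeddings  : (n, d) array, one text-embedding vector per sampled output
    bias_scores : (n,) array, higher = more biased (scale is an assumption)
    """
    emb = np.asarray(embeddings, dtype=float)
    bias = np.asarray(bias_scores, dtype=float)
    # Weight each sample inversely to its bias, so heavily biased samples
    # pull the centroid less (one plausible equity-aware weighting).
    weights = 1.0 - bias / (bias.max() + 1e-12)
    weights = weights / weights.sum()
    centroid = weights @ emb                      # equity-aware weighted centroid
    dists = np.linalg.norm(emb - centroid, axis=1)
    return int(np.argmin(dists))                  # sample nearest the centroid
```

With uniform (zero) bias scores this reduces to the plain centroid of the samples; as a sample's bias score grows, its influence on the centroid shrinks toward zero.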

Deeper Inquiries

How can REQUAL-LM be extended to handle dynamic or evolving datasets and prompts?

REQUAL-LM can be extended to handle dynamic or evolving datasets and prompts by implementing a mechanism for continuous learning and adaptation. This can be achieved through the following strategies:

- Incremental learning: Continuously update the model with new data and prompts by re-sampling from the LLM and updating the weighted centroid based on the new samples. By incorporating new data into the aggregation process, the model can adapt to changes in the dataset over time.
- Active learning: Introduce an active learning component that identifies the most informative samples to query from the LLM. By selecting samples strategically, the model can focus on areas of the dataset that are evolving or where additional information is needed.
- Feedback loop: Let the model learn from the outcomes of previous predictions. By analyzing the performance of past outputs and adjusting the aggregation process accordingly, the model can improve its reliability and equity over time.
- Real-time monitoring: Detect changes in the dataset or prompts as they occur. By continuously monitoring the data distribution and prompt characteristics, the model can adapt its aggregation strategy to ensure accurate and unbiased outputs.
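One way to realize the incremental-learning strategy is to keep the weighted centroid as a running sum, so each newly sampled output can be folded in without re-aggregating the full history. The sketch below is illustrative (the class name and interface are assumptions, not part of REQUAL-LM):

```python
import numpy as np

class StreamingCentroid:
    """Running equity-weighted centroid over a stream of LLM samples.

    Maintains sum(w_i * e_i) and sum(w_i) so the centroid can be
    updated in O(d) per new sample instead of recomputing over all
    past samples.
    """

    def __init__(self, dim):
        self.weighted_sum = np.zeros(dim)
        self.total_weight = 0.0

    def update(self, embedding, weight):
        # Fold one new (embedding, equity weight) pair into the running sums.
        self.weighted_sum += weight * np.asarray(embedding, dtype=float)
        self.total_weight += weight

    @property
    def centroid(self):
        return self.weighted_sum / max(self.total_weight, 1e-12)
```

After each update the nearest stored sample to `centroid` can be re-selected, so the chosen response tracks the evolving distribution of outputs.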

How can the potential limitations of using text embeddings to measure bias be addressed, and what are these limitations?

Limitations of using text embeddings to measure bias:

- Semantic gap: Text embeddings may not capture all nuances of bias present in the data, leaving a gap between what the embeddings represent and the actual bias in the text.
- Limited context: Embeddings may not consider the broader context or historical biases associated with certain words or phrases, resulting in incomplete bias detection.
- Data representation: Biases present in the training data used to create the embeddings can propagate into the embeddings themselves, leading to biased representations.

Addressing these limitations:

- Bias-aware embeddings: Develop text embeddings that explicitly model and mitigate biases present in the data; debiasing techniques can be applied during the embedding generation process.
- Ensemble approaches: Combine multiple text embedding models to capture a diverse range of biases and reduce the impact of individual model limitations.
- Fine-tuning: Fine-tune text embeddings on specific bias detection tasks to enhance their sensitivity to bias signals in the data.
- Regularization techniques: Apply regularization during the training of text embeddings to encourage more robust and unbiased representations.
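The semantic-gap limitation can be made concrete with a common embedding-based bias measure: project a text vector onto a "bias direction" formed from the difference of two contrasting attribute embeddings. Such a score only detects bias aligned with that single direction, which is exactly why nuanced or contextual bias can slip through. A toy sketch (the function name and scoring convention are assumptions, and real systems would use learned sentence embeddings):

```python
import numpy as np

def bias_score(text_vec, attr_a, attr_b):
    """Cosine-style alignment of a text embedding with a bias direction.

    The direction is the difference between embeddings of two contrasting
    attribute terms (e.g. demographic terms). A larger |score| suggests the
    text leans toward one group; a score near 0 reads as neutral -- even if
    the text is biased along some direction this measure does not capture.
    """
    v = np.asarray(text_vec, dtype=float)
    direction = np.asarray(attr_a, dtype=float) - np.asarray(attr_b, dtype=float)
    direction = direction / np.linalg.norm(direction)
    return float(v @ direction / (np.linalg.norm(v) + 1e-12))
```

Ensembling several such directions (or several embedding models), as suggested above, widens the subspace of bias the score can see, which is one way to narrow the semantic gap.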

How can the concept of reliability and equity in LLM outputs be applied to other domains beyond natural language processing?

The concept of reliability and equity in LLM outputs can be applied to other domains beyond natural language processing by adapting the principles and methodologies of REQUAL-LM to the specific characteristics of those domains. Some examples:

- Image recognition: Develop an aggregation method similar to REQUAL-LM for combining predictions from multiple image recognition models to ensure reliable and unbiased results, especially in applications like facial recognition.
- Healthcare: Aggregate diagnoses or treatment recommendations from multiple medical AI models to provide reliable and equitable healthcare decisions, considering factors like patient demographics and medical history.
- Finance: Aggregate predictions from financial models for investment recommendations, risk assessment, and fraud detection, ensuring fair outcomes for all stakeholders.
- Education: Apply the concept of equity in LLM outputs to personalize learning experiences, considering individual needs, backgrounds, and learning styles to provide reliable and unbiased educational support.

By customizing the methodology of REQUAL-LM to the specific requirements of different domains, the principles of reliability and equity can be integrated into applications well beyond natural language processing.