AXOLOTL is a novel post-processing framework for mitigating bias in Large Language Model outputs. Operating agnostically across tasks and models, it guides the model to debias its own outputs efficiently, promoting fairness while preserving task performance.
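To make the post-processing idea concrete, below is a minimal sketch of a self-debiasing loop under stated assumptions. It illustrates the general technique, not AXOLOTL's actual implementation: the `generate` callable, the `looks_biased` check, and the rewrite template are hypothetical stand-ins, and the model is treated purely as a black box, which is what makes such an approach model- and task-agnostic.

```python
# Hedged sketch of post-processing self-debiasing: the model's own output
# is checked, and if flagged, the model is re-prompted to rewrite it.
# No weights or training data are touched.

DEBIAS_TEMPLATE = (
    "The following text may contain biased or stereotyped language:\n"
    "{text}\n"
    "Rewrite it so the same information is conveyed without the bias."
)

# Toy stand-in for a bias detector; a real system would use a trained
# classifier or counterfactual probing rather than a keyword list.
STEREOTYPE_CUES = {"bossy", "hysterical", "thug"}

def looks_biased(text: str) -> bool:
    return any(cue in text.lower() for cue in STEREOTYPE_CUES)

def self_debias(generate, prompt: str, max_rounds: int = 3) -> str:
    """Post-process a black-box LLM's output by asking the model itself
    to rewrite flagged text. `generate` is any text-in/text-out callable
    (e.g., a wrapper around a chat API); it is an assumed interface."""
    text = generate(prompt)
    for _ in range(max_rounds):
        if not looks_biased(text):
            break
        text = generate(DEBIAS_TEMPLATE.format(text=text))
    return text
```

Because the loop only consumes and produces text, it composes with any generation backend, which is the practical appeal of post-processing debiasing over retraining.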
This survey provides a comprehensive overview of recent advances in addressing bias and promoting fairness in large language models (LLMs). It explores definitions of fairness, techniques for quantifying bias, and algorithms for mitigating bias at different stages of the LLM workflow. The survey also summarizes available resources, including toolkits and datasets, to facilitate further research and development of fair LLMs.
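As one concrete instance of the bias-quantification techniques such surveys cover, the sketch below compares a causal language model's likelihood on counterfactual sentence pairs that differ only in a demographic term. The example pairs and the per-token averaging are illustrative assumptions; curated benchmarks such as CrowS-Pairs supply the datasets these evaluations typically use.

```python
# Hedged sketch of counterfactual-pair bias quantification with GPT-2.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_nll(sentence: str) -> float:
    """Average per-token negative log-likelihood under the model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # cross-entropy over tokens
    return loss.item()

# Illustrative pairs; real evaluations draw on curated benchmark data.
pairs = [
    ("The doctor said he would operate.", "The doctor said she would operate."),
    ("The nurse said he was on duty.", "The nurse said she was on duty."),
]

for a, b in pairs:
    gap = sentence_nll(a) - sentence_nll(b)
    print(f"NLL gap {gap:+.3f}: {a!r} vs {b!r}")

# A consistently signed gap across many pairs indicates the model assigns
# systematically different likelihoods by demographic term, i.e., bias.
```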