Core Concepts
Large Language Models exhibit bias against protected groups, amplifying societal biases and stereotypes.
Abstract
The paper investigates bias in Large Language Models (LLMs) concerning protected group categories such as gender, sexuality, religion, and race. The study prompts LLMs to suggest occupations for, and generate stories about, individuals from different groups. It finds pervasive bias against minoritized groups, particularly in the gender and sexuality domains, along with a Western bias. The model's tendency to overemphasize diversity and equity while overshadowing other group characteristics raises concerns about potential harm.
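The prompting setup can be illustrated with a minimal sketch. The prompt templates, group terms, model name, and the use of the OpenAI chat-completions client below are illustrative assumptions for exposition, not the paper's actual protocol:

```python
# Sketch of a prompt-based probing loop: generate one completion per
# (template, group term) pair and record it for later bias annotation.
# The templates, terms, and model are hypothetical stand-ins.
from itertools import product
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

GROUP_TERMS = {
    "gender": ["man", "woman", "non-binary person"],
    "religion": ["Christian", "Muslim", "Jewish", "atheist"],
}

TEMPLATES = [
    "Suggest five occupations for a {term}.",
    "Write a short story about a {term}.",
]

def collect_completions(model: str = "gpt-3.5-turbo") -> list[dict]:
    """Collect model responses for every template/group-term combination."""
    records = []
    for template, (category, terms) in product(TEMPLATES, GROUP_TERMS.items()):
        for term in terms:
            prompt = template.format(term=term)
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            records.append({
                "category": category,
                "term": term,
                "prompt": prompt,
                "completion": response.choices[0].message.content,
            })
    return records
```

Scaling this loop over more templates and group terms is how a corpus of completions (such as the >10k responses cited below) could be assembled for annotation.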
Directory:
- Abstract
  - Investigates the behavior of LLMs in the ethics and fairness domains.
  - The study includes sentence-completion and story-generation tasks.
- Introduction
  - Rapid, widespread adoption of Large Language Models.
  - Concerns about the perpetuation of societal biases.
- Related Work
  - Extensive documentation of biases in language models.
- Methodology
  - Bias tested through prompt continuations and free-form generated text.
- Results
  - Analysis of model responses for bias across the protected group categories.
- Discussion
  - Findings highlight significant bias in model generations.
- Limitations
  - The study examines a restricted set of group categories and values.
Stats
"We collect >10k sentence completions made by a publicly available LLM."
"In all, only 33% of responses were adjudged devoid of bias."
"95% of stories contained one male protagonist and one female protagonist."
Quotes
"The fact that the model 'over-corrects' by generating a substantial proportion of responses that were judged as biased or which contained allusions to a broad category of 'diversity' is itself problematic."
"Only white, straight, non-religious, cis men received occupation suggestions that did not pigeon-hole them according to their group characteristics."