The rapid advancement of large language models (LLMs) has raised concerns about societal bias, especially when models are trained on offensive online language. This paper investigates ethnic, gender, and racial bias in KcBERT (a BERT model pre-trained on Korean comments) after fine-tuning on the Korean Offensive Language Dataset (KOLD). Bias is quantified with the Log Probability Bias Score (LPBS) and the Categorical Bias Score (CBS); the results show that fine-tuning reduces ethnic bias but produces significant shifts in gender and racial bias. Two mitigation methods are proposed: balancing the data used for pre-training and applying a debiasing regularization term during training. The experimental analysis highlights the need for preemptive measures in bias mitigation.
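The summary cites LPBS and CBS without spelling out how such mask-probing scores are obtained. As a rough illustration, the sketch below computes an LPBS-style score in the spirit of Kurita et al.'s normalized log-probability: compare the probability a masked language model assigns to a target group word when an attribute word is present versus when it is masked out, then take the difference between two groups. The checkpoint (an English model as a stand-in; the paper probes KcBERT with Korean templates), the template, and the word choices are assumptions made for illustration, not the paper's exact setup.

```python
# Minimal sketch of an LPBS-style probe, assuming Kurita et al.'s (2019)
# normalized log-probability formulation. Checkpoint, template, and target/
# attribute words are illustrative; the paper's probe targets KcBERT with
# Korean templates and single-token Korean group/attribute words.
import math
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "bert-base-uncased"  # English stand-in so the example words are single tokens
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

def mask_fill_prob(sentence: str, word: str) -> float:
    """Probability the model assigns to `word` at the first [MASK] position.
    Assumes `word` is a single token in the model vocabulary."""
    inputs = tokenizer(sentence, return_tensors="pt")
    mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits[0, mask_positions[0]].softmax(dim=-1)
    return probs[tokenizer.convert_tokens_to_ids(word)].item()

def normalized_log_prob(template: str, target: str, attribute: str) -> float:
    """log P(target | attribute shown) - log P(target | attribute masked).
    Assumes the template places the target slot before the attribute slot."""
    p_target = mask_fill_prob(template.format(target="[MASK]", attribute=attribute), target)
    p_prior = mask_fill_prob(template.format(target="[MASK]", attribute="[MASK]"), target)
    return math.log(p_target) - math.log(p_prior)

# LPBS for one template: difference of normalized log-probabilities between
# two target groups for the same attribute word.
template = "{target} people are {attribute}."  # illustrative template
lpbs = (normalized_log_prob(template, "korean", "violent")
        - normalized_log_prob(template, "american", "violent"))
print(f"Illustrative LPBS: {lpbs:.4f}")
```

CBS is typically obtained by aggregating the variance of such normalized log-probabilities over a larger set of target groups, attribute words, and templates, so the same `normalized_log_prob` helper could serve as the building block for both metrics.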
Source: J. K. Lee et al., arXiv, 2024-03-19. https://arxiv.org/pdf/2403.10774.pdf