Key Concepts
Societal bias exists in Korean language models due to language-dependent characteristics.
Abstract
The rapid advancement of large language models (LLMs) has raised concerns about societal bias, especially in online offensive language. This paper investigates ethnic, gender, and racial biases in KcBERT (a Korean-comments variant of Bidirectional Encoder Representations from Transformers) fine-tuned on the KOLD dataset. The study quantitatively evaluates bias using the LPBS and CBS metrics, showing a reduction in ethnic bias but significant changes in gender and racial biases after fine-tuning. Two mitigation methods are proposed: data balancing during pre-training and Debiasing Regularization during training. Experimental analysis highlights the need for preemptive measures in bias mitigation.
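The abstract's second mitigation method, Debiasing Regularization, can be pictured as adding a penalty term to the training objective that discourages the model from assigning different probabilities to paired demographic terms. The sketch below is only an illustration under that assumption; the function name, the squared-gap penalty, and the weight `lam` are hypothetical and not taken from the paper.

```python
def debias_regularized_loss(task_loss: float,
                            p_group_a: float,
                            p_group_b: float,
                            lam: float = 0.1) -> float:
    """Hypothetical sketch of a debiasing regularizer.

    task_loss:  the model's ordinary training loss for a batch.
    p_group_a/b: probabilities the model assigns to a pair of
                 demographic counterparts (e.g. "he" vs. "she")
                 in the same masked context.
    lam:        weight of the fairness penalty.

    The penalty is the squared gap between the two probabilities,
    so the total loss is minimized when the pair is treated equally.
    """
    return task_loss + lam * (p_group_a - p_group_b) ** 2
```

With equal probabilities the penalty vanishes and the loss reduces to the task loss alone; any gap between the paired terms increases the loss quadratically.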
Statistics
LPBS adopts a template-based approach similar to DisCo, computing the degree of bias by comparing the probabilities that the language model assigns to a specific attribute or target word at the [MASK] position.
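The comparison LPBS performs can be sketched as a log-ratio of two masked-token probabilities: the probability of the attribute with the target word present, versus a prior probability with the target position also masked. This is a minimal sketch of that score; the probabilities would come from a masked language model in practice, and the function name is illustrative.

```python
import math

def lpbs(p_target: float, p_prior: float) -> float:
    """Log-probability bias score for one (template, attribute) pair.

    p_target: P([MASK] = attribute) with the target word (e.g. a
              gendered or ethnic term) filled into the template.
    p_prior:  P([MASK] = attribute) with the target position masked
              as well, serving as the model's prior.

    A positive score means the target word raises the attribute's
    probability; zero means the target has no effect.
    """
    return math.log(p_target / p_prior)
```

For example, if an attribute's probability doubles once a target word is inserted, the score is log 2 ≈ 0.693; identical probabilities give a score of exactly zero.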
CBS generalizes such metrics to multi-class targets, measuring the variance of normalized log-probability bias scores across the target groups.
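Because CBS is defined over multi-class targets, it can be pictured as the variance of the per-group normalized log-probability scores for a template: zero variance means no group is favored over the others. The sketch below assumes one score per target group has already been computed (e.g. an LPBS-style log-ratio per ethnic group); averaging over templates and attributes, as the full metric would, is omitted for brevity.

```python
from statistics import pvariance

def cbs(scores_per_target: list[float]) -> float:
    """Sketch of a categorical bias score for one template.

    scores_per_target: one normalized log-probability bias score per
    target class (e.g. per ethnic group). The population variance of
    these scores is zero when the model treats all groups alike and
    grows as the model favors some groups over others.
    """
    return pvariance(scores_per_target)
```

A usage example: three groups scored identically give a CBS of 0, while scores of 1.0 and -1.0 for two groups give a variance of 1.0.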
Quotes
"We define such harm as societal bias and assess ethnic, gender, and racial biases in a model fine-tuned with Korean comments."
"Our contribution lies in demonstrating that societal bias exists in Korean language models due to language-dependent characteristics."
"Experimental analysis comparing the biases of the two models through Korean demonstrates the need for preemptive measures in bias mitigation."