Core Concepts
Integrating social awareness into natural language processing (NLP) models and systems is crucial for making them more natural, helpful, and safe across diverse users and contexts.
Abstract
The paper discusses the need for socially aware language technologies that can understand the social context, perspectives, and emotions expressed in human language. It argues that many issues facing modern NLP, such as bias, toxicity, and fairness concerns, stem from a lack of awareness of the social factors, contexts, and dynamics communicated through language.
The paper defines socially aware language technologies as the study and development of language technologies from a social perspective. It outlines three key aspects that socially aware NLP needs to account for: social factors (e.g., speaker characteristics, social relations, cultural norms), social interaction (e.g., power dynamics, trust, user expectations), and social implication (e.g., perpetuation of biases, job displacement, productivity gains).
The paper then discusses considerations and a process for building socially aware NLP, including accessing diverse communities, incorporating context and interaction dynamics, and addressing ethical and social implications. It also highlights key directions for advancing the field: formulating tasks that operationalize social awareness, developing computational methods to detect social awareness, building systems that exhibit social awareness, evaluating social awareness in real-world applications, and understanding the societal impact of socially aware language technologies.
The paper concludes by discussing the historical context of socially aware NLP, its connection to emotional intelligence and social intelligence, and the future of the field as it aims to move beyond traditional language processing tasks and integrate a deeper understanding of human communication and social dynamics.
Stats
"NLP has made significant strides in recent years, thanks in part to the introduction of large pretrained language models (LLMs) based on Transformers."
"Word embeddings, which represent words in a mathematical space, can, for example, inadvertently capture and reinforce biases in training data, perpetuating stereotypes and inequalities."
"Machine translation systems have been shown to generate translations with unintended biases or inaccuracies, potentially exacerbating cultural and societal misunderstandings."
Quotes
"Many of these issues facing modern NLP share a common core. Namely, they result from failing to consider language (technologies) in the context of communities, cultural and ideological differences, and social contexts."
"Social awareness is not restricted to NLP; it should be an integral and foundational component across all modalities of AI."
"Socially aware language technologies must be designed with ethical and social considerations, such as fairness, transparency, and privacy, to avoid perpetuating stereotypes or biases and to respect user privacy."