Core Concepts
Large language models can detect arguments that would be particularly persuasive to individuals with specific demographics or beliefs, indicating their potential to generate targeted misinformation and propaganda.
Summary
The study investigates whether large language models (LLMs) can detect content that would be persuasive to individuals with specific demographics or beliefs. The key findings are:
- Argument Quality (RQ1):
  - GPT-4 performs on par with humans at judging argument quality, that is, at identifying convincing arguments.
  - Other LLMs, such as Llama 2, perform worse than random guessing on this task.
- Correlating Beliefs and Demographics with Stances (RQ2):
  - LLMs perform similarly to crowdworkers in predicting individuals' stances on specific topics based on their demographics and beliefs.
  - A supervised machine learning model (XGBoost) outperforms the LLMs on this task.
- Recognizing Persuasive Arguments (RQ3):
  - LLMs perform similarly to crowdworkers in predicting individuals' stances after those individuals have read a debate.
- Stacking LLM Predictions:
  - Combining predictions from multiple LLMs improves performance, even surpassing human-level accuracy in RQ2 and RQ3.
The results suggest that LLMs can detect personalized persuasive content, indicating their potential to generate targeted misinformation and propaganda. The authors argue that this provides an efficient framework to continuously benchmark the persuasive capabilities of LLMs as they evolve.
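To make the stacking idea from the summary concrete, here is a minimal sketch of how predictions from several models could be combined by a meta-learner. It rests on assumptions: the summary does not say which meta-learner the authors used, so the logistic regression here is a stand-in, and the three "models" and their predictions are toy data invented for illustration, not results from the study. All function and variable names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(base_preds, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression meta-learner on base-model predictions.

    base_preds: one row per example, one 0/1 prediction per base model.
    labels:     the true 0/1 label for each example.
    Returns one weight per base model plus a bias term.
    """
    n_models = len(base_preds[0])
    n = len(labels)
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(base_preds, labels):
            p = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)
            err = p - y  # gradient of the log loss w.r.t. the logit
            for j in range(n_models):
                grad_w[j] += err * row[j]
            grad_b += err
        # Plain batch gradient descent on the mean log loss.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def stacked_predict(weights, bias, row):
    """Combine one example's base-model predictions into a single 0/1 label."""
    z = sum(w * x for w, x in zip(weights, row)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


# Toy data, invented for illustration only (not taken from the study).
labels = [1, 0, 1, 0, 1, 0, 1, 0]
model_a = [1, 0, 1, 0, 1, 0, 1, 0]  # hypothetical model that always matches the label
model_b = [1, 1, 0, 0, 1, 1, 0, 0]  # hypothetical model at chance level
model_c = [1, 1, 1, 1, 0, 0, 0, 0]  # hypothetical model at chance level
base_preds = [list(row) for row in zip(model_a, model_b, model_c)]

weights, bias = fit_meta_learner(base_preds, labels)
stacked = [stacked_predict(weights, bias, row) for row in base_preds]
print("meta-learner weights:", [round(w, 2) for w in weights])
print("stacked predictions: ", stacked)
```

On this toy data the meta-learner assigns its largest weight to the model that always matches the label, which is the intuition behind stacking: the combiner learns which base models to trust. In practice the meta-learner would be fit on held-out predictions rather than on the same examples it is evaluated on.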
Statistics
"78% of Black, 72% of Asian, and 65% of Hispanic workers see efforts on increasing diversity, equity, and inclusion at work positively, compared to 47% of White workers." (Minkin, 2023)
"Tailoring messages to different personality traits can make them more persuasive" (Hirsh et al., 2012)
"Men and women differ significantly in their responsiveness to different persuasive strategies" (Orji et al., 2015)
Quotes
"If LLMs can detect good arguments (RQ1), determine the correlation between demographics and previously stated beliefs with people's stances on new specific topics (RQ2), and determine whether an argument will convince specific individuals (RQ3), they are likely better at generating misinformation and propaganda."