Key Insights
In this study, GPT-4 adhered to annotation guidelines more consistently than human evaluators when identifying subtle user intentions to quit vaping in social media posts.
Summary
This study explores the use of large language models, both GPT-4 and BERT-based models, to analyze user posts from the r/QuitVaping subreddit and identify users who are considering quitting vaping.
The key highlights are:
A sample dataset of 1,000 Reddit posts was extracted. Human evaluators annotated 120 randomly selected posts, labeling sentences as indicating a "quit vaping" intention or not.
The human-annotated dataset was used to fine-tune several BERT-based language models, including BioBERT, DistilBERT, and RedditBERT, for a binary classification task to predict quit vaping intentions.
GPT-4 was also evaluated on the same task. The researchers found that GPT-4 adhered to the annotation guidelines more consistently than the human evaluators and detected more nuanced quit vaping intentions that the human annotators may have overlooked.
The BERT-based models achieved high overall accuracy (up to 95%) but struggled to correctly identify the positive "quit vaping" class, with the best model (BioBERT) having a recall of only 60% on that class.
The findings highlight the potential of advanced large language models like GPT-4 in enhancing the accuracy and reliability of social media data analysis, especially for identifying subtle user intentions that may be difficult for human evaluators to detect.
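The gap between high overall accuracy and low recall on the "quit vaping" class is typical when the positive class is rare. The sketch below illustrates the arithmetic in plain Python. The counts are hypothetical, chosen only so that the results match the 95% accuracy and 60% recall reported in the summary; they are not the study's confusion matrix, and `binary_metrics` is an illustrative helper rather than anything taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical data: 100 sentences, only 10 of which express a quit intention.
y_true = [1] * 10 + [0] * 90
# Hypothetical predictions: 6 of the 10 positives found, 1 false alarm.
y_pred = [1] * 6 + [0] * 4 + [1] * 1 + [0] * 89

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Under these assumed counts, the classifier is right on 95 of 100 sentences yet misses 4 of the 10 sentences that matter most, which is why per-class recall is worth reporting alongside accuracy for this kind of task.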
Statistics
Each Reddit post contains 9.02 sentences and 157.74 words on average.
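Per-post averages of this kind can be computed with a few lines of Python. The sketch below is only an illustration: the summary does not say how the authors segmented sentences or counted words, so the punctuation-based splitter, the whitespace word count, the `post_stats` helper, and the two sample posts are all assumptions.

```python
import re


def post_stats(posts):
    """Return (average sentences per post, average words per post)."""
    sentence_counts = []
    word_counts = []
    for post in posts:
        # Naive splitter (an assumption): break after '.', '!' or '?'
        # when followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", post.strip()) if s]
        sentence_counts.append(len(sentences))
        # Words are counted as whitespace-separated tokens.
        word_counts.append(len(post.split()))
    n = len(posts)
    return sum(sentence_counts) / n, sum(word_counts) / n


# Hypothetical posts, not taken from the dataset.
posts = [
    "I want to quit vaping. It is hard!",
    "Day three without a puff.",
]
avg_sentences, avg_words = post_stats(posts)
print(avg_sentences, avg_words)
```

A dedicated sentence tokenizer would handle abbreviations and informal punctuation better than this regular expression, which matters for Reddit text.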
Quotes
"Notably, when compared to human evaluators, GPT-4 model demonstrates superior consistency in adhering to annotation guidelines and processes, showcasing advanced capabilities to detect nuanced user quit-vaping intentions that human evaluators might overlook."