Khái niệm cốt lõi
Large language models struggle to effectively distill text by removing forbidden variables while preserving other semantic content, posing challenges for computational social science investigations.
Thống kê
"The dataset is approximately balanced in both sentiment and topic labels."
"Sentiment Accuracy ↓"
"Topic Accuracy ↑"
"Train Test Original Reviews LLM Classifier Accuracy"
"Train Test Distilled Reviews Classifier Accuracy"
Trích dẫn
"No distillation"
"Mean projection"
"Mistral 7B"
"GPT 4"