Key Concepts
Neural models learn hidden representations of individual rumor-related tweets from the very beginning of a rumor. This improves classification performance over time, most notably within the first 10 hours.
Summary
The paper presents a comprehensive approach for early rumor detection on Twitter. Key highlights:
The authors leverage neural models, specifically CNN and LSTM, to learn hidden representations of individual rumor-related tweets at the very beginning of a rumor. This helps capture more meaningful signals than just using enquiries or aggregated tweet content.
The authors build a cascaded model that combines the tweet-level credibility scores with a wide range of low- and high-level features, including text, user, Twitter, and epidemiological features. The model is structured as a Dynamic Series-Time Structure (DSTS) so that it captures how the features change over time.
The authors conduct an extensive study on the impact of different feature groups over time. They find that text features, CreditScore, and CrowdWisdom are the most effective features, especially in the early stages of rumor spreading.
The authors compare their automated system with human experts and show that within 25 hours, the average time human editors take to debunk a rumor, their model achieves 87% accuracy.
Overall, the approach detects rumors early on Twitter by combining low-level tweet representations with high-level features in a cascaded model.
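The time-series structure mentioned in the highlights can be made more concrete with a small sketch. The code below is not the authors' implementation. It shows one plausible way to turn per-tweet features into a time-aware vector: average each feature within fixed time intervals, then append the change between consecutive intervals. The function name `dsts_features`, the feature names `credit_score` and `has_url`, and the choice to treat an empty interval as 0.0 are all assumptions made for illustration.

```python
from statistics import mean


def dsts_features(tweets, interval_hours, n_intervals):
    """Build a time-aware feature vector for one event (illustrative only).

    tweets: list of (hours_since_first_tweet, {feature_name: value}) pairs.
    Returns (feature_names, vector): per-interval means for every feature,
    followed by the slopes between consecutive intervals.
    """
    # Group each tweet's feature dict by the interval it falls into.
    buckets = [[] for _ in range(n_intervals)]
    for hours, feats in tweets:
        idx = int(hours // interval_hours)
        if 0 <= idx < n_intervals:
            buckets[idx].append(feats)

    names = sorted({name for _, feats in tweets for name in feats})

    # Mean of every feature per interval (assumption: empty interval -> 0.0).
    levels = [
        [mean(f.get(name, 0.0) for f in bucket) if bucket else 0.0
         for name in names]
        for bucket in buckets
    ]

    # Slope of every feature between consecutive intervals.
    slopes = [
        [(curr_v - prev_v) / interval_hours
         for prev_v, curr_v in zip(prev, curr)]
        for prev, curr in zip(levels, levels[1:])
    ]

    # Flatten: all interval means first, then all slopes.
    vector = [v for row in levels for v in row]
    vector += [v for row in slopes for v in row]
    return names, vector


# Hypothetical data: three tweets spread over the first two hours.
example_tweets = [
    (0.5, {"credit_score": 0.2, "has_url": 1.0}),
    (0.8, {"credit_score": 0.4, "has_url": 0.0}),
    (1.5, {"credit_score": 0.9, "has_url": 1.0}),
]
names, vector = dsts_features(example_tweets, interval_hours=1.0, n_intervals=2)
```

Here `credit_score` stands in for a tweet-level credibility score, so its per-interval mean plays the role of an aggregated credibility feature. The resulting vector could then be passed to any standard classifier.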
Statistics
"Rumors are wildfires that are difficult to put out and traditional news sources or official channels, such as police departments, subsequently struggle to communicate verified information to the public, as it gets lost under the flurry of false information."
"Within 25 hours–the average time for human editors to debunk rumors–we achieve 87% accuracy."
Quotes
"Our intuition is to leverage the 'wisdom of the crowd' theory; such that even a certain portion of tweets at a moment (mostly early stage) are weakly predicted (because of these noisy factors), the ensemble of them would attribute to a stronger prediction."
"Aggregating all relevant tweets of the event at this point can be of noisy and harm the classification performance."
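The "wisdom of the crowd" intuition in the first quote can be illustrated with a short calculation. The sketch below is not the paper's aggregation method. It assumes an idealized setting in which every tweet-level prediction is correct independently with the same probability `p`, and computes how often a simple majority vote over `n` such predictions is correct. Real tweets about the same event are correlated, so the actual gain would be smaller. The function name `majority_accuracy` is hypothetical.

```python
from math import comb


def majority_accuracy(p, n):
    """Probability that a strict majority of n independent weak predictions,
    each correct with probability p, yields the correct label.

    n must be odd so that ties cannot occur.
    """
    if n % 2 == 0:
        raise ValueError("use an odd number of predictions to avoid ties")
    needed = n // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(needed, n + 1)
    )


# Compare a single weak prediction with growing ensembles of them.
for n in (1, 3, 11, 51):
    print(n, round(majority_accuracy(0.6, n), 3))
```

Under these assumptions, accuracy rises with the number of predictions whenever `p` is above 0.5, which matches the intuition that many weakly predicted tweets can combine into a stronger event-level prediction.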