A Public Dataset of Twitter/X Posts about the 2024 U.S. Presidential Election (May 1, 2024 - July 31, 2024)


Core Concepts
This paper introduces a publicly available dataset of posts from X (formerly Twitter) related to the 2024 U.S. Presidential Election, offering a valuable resource for researchers to study the impact of social media on political discourse and election integrity.
Abstract

This research paper introduces a comprehensive dataset of posts from X (formerly Twitter) related to the 2024 U.S. Presidential Election, collected between May 1, 2024, and July 31, 2024. The researchers developed a custom scraping engine called X-Scraper to gather a wide range of election-related content, including posts, metadata, and user information.

Research Objective:

The primary objective of this research is to provide a publicly available dataset that captures the dynamics of political discourse on X during the 2024 U.S. Presidential Election. This dataset aims to facilitate research on the influence of social media on public opinion, the spread of misinformation, and the role of key figures in shaping online narratives.

Methodology:

The researchers developed a custom scraping engine, X-Scraper, to collect publicly available data from X.com. The scraper utilizes targeted keywords related to the 2024 election, political figures, and emerging events. It gathers various post-specific details, including content, media, user metadata, and user interface interactions. The data collection was divided into smaller intervals to account for specific events and discourse patterns.
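The two ideas in this paragraph, keyword targeting and splitting the collection period into smaller intervals, can be sketched as follows. This is only an illustration, not X-Scraper's actual code: the function names, the 7-day interval length, and the substring-matching rule are all assumptions, since the summary does not specify them.

```python
from datetime import date, timedelta


def split_into_intervals(start, end, days):
    """Yield (window_start, window_end) date pairs covering [start, end] inclusive."""
    cursor = start
    while cursor <= end:
        # Each window spans `days` calendar days, clipped to the overall end date.
        window_end = min(cursor + timedelta(days=days - 1), end)
        yield cursor, window_end
        cursor = window_end + timedelta(days=1)


def matches_keywords(text, keywords):
    """Case-insensitive substring match (assumed rule; it also matches inside longer words)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


# Collection period stated in the paper; the 7-day window length is hypothetical.
windows = list(split_into_intervals(date(2024, 5, 1), date(2024, 7, 31), days=7))

# Hypothetical keyword list, based on the terms named in the key findings.
keywords = ["biden", "trump", "maga"]

print(len(windows), "windows; first:", windows[0], "last:", windows[-1])
print(matches_keywords("Watching the TRUMP rally tonight", keywords))
```

Shorter windows around major events would follow the same pattern with a smaller `days` value for the affected date range.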

Key Findings:

  • Preliminary analysis of the dataset reveals the prominence of keywords like "Biden," "Trump," and "MAGA," highlighting the centrality of these figures and slogans in online political discourse.
  • The frequent use of hashtags like "#maga," "#trump2024," and "#bidenharris2024" indicates significant public engagement with key candidates and movements.
  • The dataset also highlights the popularity of platforms like YouTube and X.com for sharing multimedia content and the influence of news sites like Fox News and Breitbart in shaping political narratives.
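Frequency counts of this kind can be reproduced with a few lines of Python. The sketch below is a minimal illustration, not the authors' analysis code: the sample posts are invented, and the dataset's real field names and file format are not given in this summary.

```python
import re
from collections import Counter
from urllib.parse import urlparse

HASHTAG_RE = re.compile(r"#\w+")
URL_RE = re.compile(r"https?://\S+")


def count_hashtags(posts):
    """Count hashtags case-insensitively, ignoring '#' fragments inside URLs."""
    counts = Counter()
    for text in posts:
        without_urls = URL_RE.sub(" ", text)
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(without_urls))
    return counts


def count_domains(posts):
    """Count the host names of links shared in posts, dropping a leading 'www.'."""
    counts = Counter()
    for text in posts:
        for url in URL_RE.findall(text):
            host = urlparse(url).netloc.lower()
            if host.startswith("www."):
                host = host[4:]
            if host:
                counts[host] += 1
    return counts


# Invented sample posts, for illustration only.
sample_posts = [
    "Rally tonight #MAGA #Trump2024 https://www.youtube.com/watch?v=example",
    "Volunteer this weekend #BidenHarris2024 https://x.com/example/status/1",
    "New clip is up #maga https://youtube.com/watch?v=example2",
]

print(count_hashtags(sample_posts).most_common(3))
print(count_domains(sample_posts).most_common(3))
```

Lower-casing hashtags before counting is a design choice that merges variants such as "#MAGA" and "#maga"; an analysis interested in capitalization would skip that step.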

Main Conclusions:

This dataset offers a valuable resource for researchers to analyze trends in public opinion, investigate the spread of misinformation, and examine the influence of key figures on X during the 2024 U.S. Presidential Election. The researchers acknowledge limitations related to data representativeness and plan to expand the dataset and incorporate data from other sources in future work.

Significance:

This research is significant as it provides a large-scale, publicly available dataset that can be used to study the complex interplay between social media and political processes during a crucial election cycle. The insights derived from this dataset can inform strategies to safeguard election integrity, mitigate misinformation, and promote a more informed and balanced online political discourse.

Limitations and Future Research:

The researchers acknowledge limitations related to the representativeness of data collected solely from X and potential biases introduced by keyword-based scraping. Future work will focus on continuous data collection, analysis of verified users and suspected bots, and incorporating data from other social media platforms to provide a more comprehensive understanding of online political discourse.

Stats
  • The dataset comprises 22 million publicly available posts on X.com.
  • The data was collected from May 1, 2024, to July 31, 2024.
  • Donald Trump had 92 million followers on X at the time of the study.
  • Kamala Harris had 21 million followers on X at the time of the study.
  • X had 611 million monthly active users at the time of the study.
  • The scraper's data retrieval rate was capped at approximately 2,300 posts per hour per account.
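The dataset size and the per-account rate cap above allow a rough, back-of-the-envelope estimate of the scraping effort involved. The calculation below is only a sketch under stated assumptions: it treats the cap as a sustained rate and assumes the scraping itself ran within the 92-day window. The summary does not report how many accounts were used or when the scraper ran, so the result is a lower bound under those assumptions, not a reported figure.

```python
import math
from datetime import date

TOTAL_POSTS = 22_000_000            # dataset size stated in the paper
POSTS_PER_HOUR_PER_ACCOUNT = 2_300  # approximate rate cap stated in the paper

# Length of the collection period, inclusive of both end dates.
days = (date(2024, 7, 31) - date(2024, 5, 1)).days + 1
window_hours = days * 24

# Total scraping time needed, summed over all accounts.
account_hours = TOTAL_POSTS / POSTS_PER_HOUR_PER_ACCOUNT

# Assumption: every account runs nonstop at the cap for the whole window.
min_accounts = math.ceil(account_hours / window_hours)

print(f"{days} days = {window_hours} hours in the window")
print(f"about {account_hours:,.0f} account-hours of scraping")
print(f"at least {min_accounts} accounts running continuously")
```

Any downtime, deduplication, or rate below the cap would push the real number of accounts or hours higher.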
Quotes
  • "Social media has become an influential force in 21st-century politics globally. X.com (formerly Twitter) has been particularly significant in shaping political tensions and public opinion, offering researchers a valuable resource for studying the ideologies that are shared, the spread of misinformation, and the online campaigns supporting political movements and candidates."
  • "Given the polarized nature of the 2024 election cycle and the impact of social media on shaping public perceptions, this dataset has significant potential to help understanding how information and narratives are shared and propagated."
  • "Insights derived from this dataset could directly inform strategies to safeguard election integrity, mitigate misinformation, and assess the influence of prominent voices on digital political discourse."

Deeper Inquiries

How can social media platforms be used to promote constructive political dialogue and mitigate the spread of misinformation during elections?

Social media platforms, while often criticized for exacerbating polarization and misinformation, possess the potential to foster constructive political dialogue and curb the spread of falsehoods during elections. Here's how:

Promoting Constructive Dialogue:

  • Facilitating Fact-Checking and Source Verification: Platforms can partner with independent fact-checking organizations to flag potentially misleading content and provide users with contextual information from reliable sources. This can help users critically evaluate information and make more informed judgments.
  • Creating Spaces for Deliberative Discussions: Platforms can design features that encourage nuanced conversations and respectful disagreements. This could involve promoting moderated forums, facilitating issue-based discussions, or highlighting content that presents diverse perspectives.
  • Empowering Users with Media Literacy Tools: Platforms can equip users with the skills to identify misinformation, understand online manipulation tactics, and engage in responsible content sharing. This could involve interactive tutorials, in-app prompts, or partnerships with media literacy organizations.

Mitigating the Spread of Misinformation:

  • Enhancing Content Moderation Policies: Platforms need to establish clear and transparent policies for identifying and removing harmful content, including hate speech, incitements to violence, and demonstrably false information. This requires investing in robust content moderation systems and human review processes.
  • Limiting the Viral Spread of Misinformation: Platforms can implement algorithms that slow down the spread of content flagged as potentially misleading. This could involve limiting the visibility of such content, adding friction to sharing mechanisms, or prioritizing authoritative sources in recommendations.
  • Increasing Transparency and Accountability: Platforms should be more transparent about their content moderation practices, algorithmic decisions, and efforts to combat misinformation. This includes providing users with clear explanations for content removals, publishing regular transparency reports, and engaging with researchers and civil society organizations.

By implementing these measures, social media platforms can play a more responsible role in fostering a healthier information ecosystem during elections.

Could the exclusion of data from other social media platforms limit the generalizability of the findings to the broader online political landscape?

Yes, the exclusion of data from other social media platforms can significantly limit the generalizability of findings to the broader online political landscape. Here's why:

  • Platform Heterogeneity: Each social media platform has its own unique user demographics, platform features, and content moderation policies. These differences can influence the type of political discourse that takes place, the spread of information, and the impact of misinformation.
  • Echo Chambers and Filter Bubbles: Users on different platforms may be exposed to different perspectives and information, leading to the formation of echo chambers and filter bubbles. This can result in skewed perceptions of public opinion and limited exposure to diverse viewpoints.
  • Cross-Platform Information Flows: Political information and misinformation often flow across multiple platforms. Studying one platform in isolation fails to capture the interconnected nature of online political communication and the potential for cross-platform manipulation campaigns.

To enhance the generalizability of findings, researchers should strive for:

  • Multi-Platform Studies: Conducting research across multiple platforms provides a more comprehensive understanding of online political discourse and the spread of misinformation.
  • Comparative Analysis: Comparing and contrasting findings across platforms can reveal platform-specific dynamics and highlight common trends in online political communication.
  • Network Analysis: Examining the flow of information and interactions across platforms can uncover cross-platform influence networks and coordinated manipulation campaigns.

By adopting these approaches, researchers can gain a more holistic and generalizable understanding of the online political landscape.

What are the ethical implications of using large-scale social media data for research, and how can researchers ensure the privacy and anonymity of users?

Using large-scale social media data for research presents significant ethical implications, particularly concerning user privacy and anonymity. Researchers must navigate these challenges responsibly.

Ethical Implications:

  • Privacy Violations: Social media data often contains sensitive personal information, including political views, religious beliefs, and social connections. Unauthorized access or disclosure of this data can have severe consequences for individuals.
  • Informed Consent: Obtaining meaningful informed consent from users whose data is being used for research is crucial. However, this can be challenging given the vast scale of social media data and the often opaque nature of data collection practices.
  • Data Security: Researchers have a responsibility to protect collected data from unauthorized access, use, or disclosure. This requires implementing robust data security measures and adhering to ethical data handling practices.
  • Potential for Harm: Research findings based on social media data can be misinterpreted or misused to harm individuals or groups. Researchers must consider the potential societal impact of their work and take steps to mitigate potential harms.

Ensuring Privacy and Anonymity:

  • Data Anonymization: Researchers should anonymize data whenever possible, removing or obfuscating personally identifiable information. This includes using techniques like pseudonymization, aggregation, and differential privacy.
  • Data Minimization: Researchers should only collect and retain the data that is strictly necessary for their research purposes. This minimizes the potential privacy risks associated with data storage and analysis.
  • Transparent Data Practices: Researchers should be transparent about their data collection, analysis, and storage practices. This includes providing clear privacy policies, disclosing data sharing agreements, and engaging in open science practices.
  • Ethical Review Boards: Researchers should seek ethical approval from institutional review boards (IRBs) before conducting research involving human subjects' data. IRBs can provide guidance on ethical data collection and analysis practices.

By adhering to these ethical principles and best practices, researchers can leverage the insights from large-scale social media data while safeguarding the privacy and anonymity of users.
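Two of the practices named above, pseudonymization and data minimization, can be illustrated with a short sketch. It is an assumption-laden example rather than a description of how the dataset was actually processed: the record fields, the `KEEP_FIELDS` whitelist, and the choice of a keyed HMAC-SHA256 hash are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical whitelist: only the fields the analysis actually needs are retained.
KEEP_FIELDS = ("text", "created_at")


def pseudonymize_id(user_id, secret_key):
    """Replace a user ID with a keyed hash so records can be linked but not trivially reversed."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def minimize_record(record, secret_key):
    """Drop non-whitelisted fields and swap the raw user ID for a pseudonym."""
    cleaned = {field: record[field] for field in KEEP_FIELDS if field in record}
    cleaned["author"] = pseudonymize_id(record["user_id"], secret_key)
    return cleaned


# Invented example record; real field names in the dataset may differ.
raw = {
    "user_id": "12345",
    "handle": "@example_user",
    "text": "Example post text",
    "created_at": "2024-06-01T12:00:00Z",
    "location": "Example City",
}

# The key must be stored separately from the released data.
key = b"replace-with-a-secret-key-kept-outside-the-dataset"

safe = minimize_record(raw, key)
print(sorted(safe))
```

A keyed hash keeps the same author linkable across posts without exposing the original ID, but it is pseudonymization, not full anonymization: the post text itself can still identify a user, which is why aggregation or differential privacy may be needed on top.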