The study evaluates various Twitter sampling methods to create a representative sample of US Twitter users. It analyzes tweet- and user-level metrics, demographic inferences, and population estimation accuracy. The 1% Stream method emerges as the most suitable for creating a nationally representative sample.
The research examines why random samples are hard to obtain from Twitter data and why the choice of sampling method matters. It identifies demographic inference and debiasing as central to building representative samples, and it calls for robust methodologies so that population estimates derived from social media data are accurate.
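To make the idea of debiasing concrete, the following is a minimal sketch of post-stratification weighting, one common way to reweight a sample so that its demographic mix matches known population shares. It is an illustration only and is not taken from the study: the function names, the age groups, the population shares, and the toy outcome values are all hypothetical.

```python
from collections import Counter


def poststratify_weights(sample_groups, population_shares):
    """Return one weight per sampled user.

    Each weight is (population share of the user's group) divided by
    (share of that group in the sample), so over-represented groups
    are down-weighted and under-represented groups are up-weighted.
    """
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical toy data: inferred age group per sampled user.
sample_groups = ["18-29", "18-29", "18-29", "30+"]
# Hypothetical population shares (assumed, not from the study).
population_shares = {"18-29": 0.25, "30+": 0.75}
# Hypothetical binary outcome measured for each sampled user.
outcome = [1, 1, 0, 0]

weights = poststratify_weights(sample_groups, population_shares)
raw_estimate = sum(outcome) / len(outcome)
debiased_estimate = weighted_mean(outcome, weights)
```

In this toy sample the younger group is over-represented, so the weighted estimate falls below the raw sample mean. The quality of such an adjustment depends on how accurately each user's demographic group was inferred in the first place.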
Key findings include differences across sampling methods in tweet generation, account characteristics, gender distribution, age demographics, and geographical representation. The results indicate that the 1% Stream method provides more reliable population estimates than the other sampling approaches.
Overall, the study offers guidance on best practices for creating national random samples of Twitter users and reiterates that accurate demographic inference and debiasing are essential in social media research.
Source: arxiv.org