Key Concepts
Incorporating both image and non-image features, such as captions, user information, and temporal data, can significantly improve the accuracy of predicting the popularity of social media posts.
Summary
The study presents a framework for predicting the popularity of image-based social media content. It extracts key image and color information from users' posts and explores a wide range of prediction models.
Key highlights:
The study utilizes the Google Cloud Vision API to extract image labels and dominant colors, which are then used as covariates in the prediction models.
The authors employ Seeded Latent Dirichlet Allocation (Seeded-LDA) to summarize the image contents into interpretable topic variables.
The study compares the performance of various models, including Linear Mixed Model, Support Vector Regression, Multi-layer Perceptron, Random Forest, and XGBoost, with linear regression as the benchmark.
The results demonstrate that models capable of capturing the underlying nonlinear interactions between covariates, such as Random Forest and XGBoost, outperform other methods.
Adding image features to the non-image covariates (captions, user information, and temporal data) significantly improves prediction accuracy compared to using the non-image covariates alone.
The proposed methods are practical and interpretable, providing direct implications for real-world applications.
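The highlight about nonlinear interactions can be illustrated with a minimal sketch. It does not use the study's data or pipeline: the covariates are synthetic, the response is assumed to depend on the product of two features, and scikit-learn's LinearRegression and RandomForestRegressor stand in for the benchmark and for one of the tree-based models (XGBoost and the other models are omitted). The function name compare_models and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def compare_models(n_samples=2000, seed=0):
    """Fit a linear benchmark and a random forest on synthetic data."""
    rng = np.random.default_rng(seed)
    # Assumption: two synthetic covariates whose product drives the response,
    # so the signal is a pure interaction with no linear main effect.
    X = rng.uniform(-1.0, 1.0, size=(n_samples, 2))
    y = X[:, 0] * X[:, 1] + rng.normal(0.0, 0.05, size=n_samples)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    models = {
        "linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Held-out R^2: higher means better out-of-sample fit.
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


scores = compare_models()
for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

On data like this, a purely additive linear model has almost nothing to fit, while the forest can approximate the interaction through successive splits. Real post data will show a smaller gap, since some of the signal there is plausibly linear.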
Statistics
The number of images per post is an important factor in predicting the popularity of social media posts.
The time difference between the post upload and data crawling is a key variable that needs to be accounted for when predicting post popularity.
The number of hashtags used in a post's caption is a significant predictor of its popularity.
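The three covariates named above could be derived from a raw post roughly as follows. This is a sketch under assumptions: the post schema (a caption string, a list of image URLs, and two timestamps), the function name non_image_features, and the hashtag pattern are all hypothetical and are not taken from the study.

```python
import re
from datetime import datetime, timezone

# Assumption: a hashtag is "#" followed by one or more word characters.
HASHTAG_PATTERN = re.compile(r"#\w+")


def non_image_features(caption, image_urls, uploaded_at, crawled_at):
    """Build three simple covariates from one post (hypothetical schema)."""
    elapsed = crawled_at - uploaded_at
    return {
        "n_images": len(image_urls),
        "n_hashtags": len(HASHTAG_PATTERN.findall(caption)),
        # Time between upload and crawl, in hours.
        "hours_since_upload": elapsed.total_seconds() / 3600.0,
    }


features = non_image_features(
    caption="Sunset at the beach #travel #sunset",
    image_urls=["a.jpg", "b.jpg"],
    uploaded_at=datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc),
    crawled_at=datetime(2023, 1, 2, 18, 0, tzinfo=timezone.utc),
)
print(features)
```

Including the elapsed time as a covariate matters because a post crawled shortly after upload has had less time to accumulate engagement than an older one.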
Quotes
"Our study presents a framework for predicting image-based social media content popularity that focuses on addressing complex image information and a hierarchical data structure."
"Experimental results demonstrate that the proposed interpretable variables achieve comparable performance to features constructed by various existing methods, including embedding vectors extracted using deep learning."
"Our considered methods are both practical and interpretable, yielding direct implications for real-world applications."