The paper proposes Demo2Vec, a method that integrates demographic information, such as income, age, education level, and employment rate, into the learning of region embeddings. The key insights are:
Demographic data contains valuable information about urban regions that is often overlooked in existing region embedding approaches. The authors show that incorporating demographic data, especially income information, can improve the predictive performance of region embedding across three common urban tasks: check-in prediction, crime rate prediction, and house price prediction.
The authors find that existing pre-training methods based on KL divergence are potentially biased towards mobility information. They propose using Jensen-Shannon divergence as a more appropriate loss function for multi-view representation learning, as it generates comparable loss values for all the dimensions involved, leading to a more stable training process.
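To make the contrast between the two divergences concrete, the following is a minimal sketch using only their standard textbook definitions for discrete distributions. It is not the paper's loss function, whose exact form is not given in this summary. The function names and the example distributions are illustrative assumptions. One property that may relate to the "comparable loss values" noted above is that Jensen-Shannon divergence is symmetric and bounded above by ln 2 (with natural logarithms), whereas KL divergence is asymmetric and unbounded.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    Terms with p_i == 0 contribute nothing; if p_i > 0 where q_i == 0,
    the divergence is infinite.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical example distributions (not taken from the paper).
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]

print("KL(p||q):", kl_divergence(p, q))
print("KL(q||p):", kl_divergence(q, p))  # differs from KL(p||q)
print("JS(p,q): ", js_divergence(p, q))  # same value as JS(q,p), at most ln 2
```

Because the mixture m is nonzero wherever p or q is nonzero, the Jensen-Shannon value stays finite even when the two distributions do not share support, which is not true of KL divergence.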
Experimental results on datasets from New York City and Chicago demonstrate that the combination of mobility and income data achieves the best overall performance, providing up to 10.22% better predictive accuracy than existing models. For cities without access to fine-grained mobility data, the authors suggest using geographic proximity and income as an effective alternative data combination for region embedding pre-training.
The authors also explore the effects of incorporating other demographic attributes, such as age, education level, employment rate, and the percentage of foreign-born population. The results show that the effectiveness of different demographic features varies across tasks and cities, highlighting the context-aware and city-specific nature of the relationships between demographic characteristics and urban dynamics.