This study explores the feasibility of a federated approach to epidemic surveillance, where crucial data is fragmented across multiple institutions. The key idea is to conduct hypothesis tests for a rise in counts behind each custodian's firewall and then combine the resulting p-values using meta-analysis techniques, without needing to share the underlying data.
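The summary does not specify which test each custodian runs locally; as a hedged sketch, one natural choice for "a rise in counts" is a one-sided exact Poisson test of the observed count against a baseline expectation, computed entirely behind the firewall so only the resulting p-value leaves the site. The function name and the Poisson assumption here are illustrative, not taken from the paper.

```python
from math import exp, factorial

def poisson_surge_pvalue(observed: int, baseline: float) -> float:
    """One-sided exact Poisson test: P(X >= observed | mean = baseline).

    A simple stand-in for a site's local surge test; the paper's exact
    test statistic may differ. Small p-values indicate counts well above
    the baseline, i.e. evidence of a surge.
    """
    # P(X >= k) = 1 - P(X <= k - 1), summing the Poisson pmf directly.
    cdf = sum(exp(-baseline) * baseline**i / factorial(i)
              for i in range(observed))
    return 1.0 - cdf
```

For example, observing 10 cases against a baseline mean of 5 yields a p-value of about 0.03, while observing 0 cases yields 1.0 (no evidence of a rise).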
The authors propose a hypothesis testing framework to identify surges in epidemic-related data streams and conduct experiments on real and semi-synthetic data to assess the power of different p-value combination methods. The findings show that relatively simple combination methods, such as Stouffer's and Fisher's methods, can achieve a high degree of fidelity in detecting surges without needing to share even aggregate data across institutions.
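The two combination rules named above are standard meta-analysis tools and can be written in a few lines. The sketch below uses only the Python standard library; Fisher's chi-square survival function is computed in closed form, which is valid here because the degrees of freedom (2k for k sites) are always even.

```python
from math import exp, log, factorial
from statistics import NormalDist

def fisher_combine(pvalues):
    """Fisher's method: -2 * sum(log p_i) ~ chi-square with 2k df under H0."""
    k = len(pvalues)
    x = -2.0 * sum(log(p) for p in pvalues)
    # Chi-square survival function with 2k df (closed form for even df):
    # P(X > x) = exp(-x/2) * sum_{i<k} (x/2)^i / i!
    return exp(-x / 2) * sum((x / 2) ** i / factorial(i) for i in range(k))

def stouffer_combine(pvalues):
    """Stouffer's method: average the z-scores, rescaled by sqrt(k)."""
    nd = NormalDist()
    k = len(pvalues)
    z = sum(nd.inv_cdf(1 - p) for p in pvalues) / k ** 0.5
    return 1 - nd.cdf(z)
```

Combining site-level p-values of 0.01, 0.02, and 0.05, for instance, gives a combined p-value well below any single input under either method, while three uninformative p-values of 0.5 combine to 0.5 under Stouffer's method.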
The authors also examine how the performance of the different meta-analysis methods is affected by factors such as the number of reporting sites, their relative sizes, and the expected magnitude of the counts. They find that Stouffer's method performs best when data is concentrated in a smaller number of sites and the magnitude of reports is relatively large, while Fisher's method is more robust in harder settings with a larger number of data holders and greater imbalances in their shares.
Additionally, the authors demonstrate that incorporating auxiliary information, such as the sites' shares and estimated total counts within a given region, can further improve the performance of the federated surveillance framework. The results suggest that effective infectious disease outbreak detection is possible in environments with decentralized data, offering a potential step towards modernizing surveillance systems in preparation for current and future public health threats.
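One common way to fold in auxiliary information of the kind described above is a weighted variant of Stouffer's method, where each site's z-score is weighted, for example, by its share of the regional counts. The weighting scheme below is a standard textbook form and an illustrative assumption; the paper's exact use of the auxiliary information may differ.

```python
from statistics import NormalDist

def weighted_stouffer(pvalues, weights):
    """Weighted Stouffer: Z = sum(w_i * z_i) / sqrt(sum(w_i^2)).

    `weights` might encode each site's share of regional counts (an
    illustrative choice, not necessarily the paper's). With equal
    weights this reduces to the unweighted method.
    """
    nd = NormalDist()
    num = sum(w * nd.inv_cdf(1 - p) for p, w in zip(pvalues, weights))
    den = sum(w * w for w in weights) ** 0.5
    return 1 - nd.cdf(num / den)
```

Note that rescaling all weights by a constant leaves the result unchanged, so shares need not be normalized; only the relative weighting of sites matters.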