This research paper presents an empirical study comparing the performance of various streaming technologies and serialization protocols for scientific data.
Bibliographic Information: Jackson, S., Cummings, N., & Khan, S. (2024). Streaming Technologies and Serialization Protocols: Empirical Performance Analysis. arXiv preprint arXiv:2407.13494v2.
Research Objective: The study aims to guide the selection of optimal streaming and serialization solutions for modern data-intensive applications, particularly in scientific computing.
Methodology: The authors developed an extensible, open-source software framework to benchmark the efficiency of 11 streaming technologies and 13 serialization protocols across 8 different datasets, resulting in over 143 combinations tested. They evaluated 11 performance metrics, including object creation latency, compression ratio, serialization/deserialization throughput, transmission latency, and total throughput.
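The shape of such a benchmark can be sketched in a few lines. This is only an illustrative sketch, not the authors' framework: it uses two standard-library codecs (`json`, `pickle`) as stand-ins for the 13 protocols, placeholder transport names instead of real streaming technologies, and a `size_ratio` metric (raw float64 bytes divided by serialized bytes) that is an assumed proxy for the paper's compression ratio. The names `SERIALIZERS`, `TRANSPORTS`, `benchmark_grid`, and `measure` are hypothetical.

```python
import json
import pickle
import time
from itertools import product

# Hypothetical stand-ins: stdlib codecs only, not the protocols from the paper.
SERIALIZERS = {
    "json": (
        lambda obj: json.dumps(obj).encode("utf-8"),
        lambda buf: json.loads(buf.decode("utf-8")),
    ),
    "pickle": (pickle.dumps, pickle.loads),
}
# Placeholder names; no real streaming technology is exercised here.
TRANSPORTS = ["transport_a", "transport_b"]


def benchmark_grid(serializers, transports):
    """Enumerate every (serializer, transport) pair to be benchmarked."""
    return list(product(serializers, transports))


def measure(name, values, repeats=50):
    """Time encoding and decoding of a list of floats with one serializer."""
    dumps, loads = SERIALIZERS[name]
    payload = {"values": values}

    start = time.perf_counter()
    for _ in range(repeats):
        encoded = dumps(payload)
    ser_seconds = max(time.perf_counter() - start, 1e-9)

    start = time.perf_counter()
    for _ in range(repeats):
        decoded = loads(encoded)
    deser_seconds = max(time.perf_counter() - start, 1e-9)

    raw_bytes = 8 * len(values)  # size of the values as packed float64
    total_mb = len(encoded) * repeats / 1e6
    return {
        "serializer": name,
        "payload_bytes": len(encoded),
        "size_ratio": raw_bytes / len(encoded),  # assumed proxy metric
        "ser_throughput_mb_s": total_mb / ser_seconds,
        "deser_throughput_mb_s": total_mb / deser_seconds,
        "round_trip_ok": decoded == payload,
    }


if __name__ == "__main__":
    samples = [i * 0.5 for i in range(1000)]
    for serializer, transport in benchmark_grid(SERIALIZERS, TRANSPORTS):
        print(transport, measure(serializer, samples))
```

With 11 technologies and 13 protocols, the same `product` call yields the 143 pairings mentioned above.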
Key Findings:
Main Conclusions: The study concludes that protocol-based serialization methods, combined with brokerless streaming technologies like gRPC and ZeroMQ, offer the best performance for streaming scientific data. The findings highlight the importance of carefully considering both serialization and streaming technology choices to optimize data transfer efficiency in data-intensive applications.
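To make the two terms concrete, the sketch below pairs a schema-defined binary encoding with a direct producer-to-consumer connection that involves no intermediate broker. It is a rough illustration under stated assumptions, not gRPC, ZeroMQ, or any protocol evaluated in the paper: the message layout (a 4-byte count followed by float64 samples) and the helper names `encode_binary`, `decode_binary`, `recv_exact`, and `send_direct` are hypothetical, and a standard-library `socket.socketpair()` stands in for the transport.

```python
import socket
import struct


def encode_binary(samples):
    """Pack samples using an assumed fixed schema: uint32 count + float64 values."""
    return struct.pack(f"<I{len(samples)}d", len(samples), *samples)


def decode_binary(buf):
    """Unpack a message produced by encode_binary."""
    (count,) = struct.unpack_from("<I", buf, 0)
    return list(struct.unpack_from(f"<{count}d", buf, 4))


def recv_exact(sock, size):
    """Read exactly `size` bytes, since recv may return fewer than requested."""
    chunks = []
    remaining = size
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_direct(samples):
    """Send one length-prefixed message straight from producer to consumer.

    Both ends live in one thread, so this only suits small messages that
    fit in the socket buffer.
    """
    producer, consumer = socket.socketpair()
    try:
        message = encode_binary(samples)
        producer.sendall(struct.pack("<I", len(message)) + message)
        (size,) = struct.unpack("<I", recv_exact(consumer, 4))
        return decode_binary(recv_exact(consumer, size))
    finally:
        producer.close()
        consumer.close()


if __name__ == "__main__":
    print(send_direct([0.25 * i for i in range(5)]))
```

In a brokered design, the producer would instead publish to an intermediary service that queues and forwards messages, adding a network hop that this direct connection avoids.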
Significance: This research provides valuable insights for scientists and engineers working with large-scale data, enabling them to make informed decisions when designing and implementing data streaming systems.
Limitations and Future Research: The study was limited to local testing, and future research could investigate performance over wide-area networks. Further exploration of emerging technologies and the impact of data characteristics on performance is also warranted.
Key insights drawn from the paper by Samuel Jacks... at arxiv.org, 11-05-2024
https://arxiv.org/pdf/2407.13494.pdf