Core Concepts
Poseidon is an open framework that provides a standardized data format, software tools, and public archives to enable FAIR (findable, accessible, interoperable, reusable) handling of archaeogenetic human genotype data.
Abstract
The study of ancient human genomes has accelerated in the last decade, with thousands of new ancient genomes being released each year. However, the field lacks both infrastructure to handle the rich context data (e.g., spatiotemporal provenience) that accompanies ancient samples and standardized archives for the derived genotype data used in most archaeogenetic studies.
The Poseidon framework was developed to address these issues. It consists of three main components:
A data format (the Poseidon package) to store genotype data together with context information in a structured, human- and machine-readable format.
Software tools, such as trident for data management, xerxes for data analysis, and qjanno for querying context data, that work with the Poseidon package format.
Public archives (the Poseidon Community Archive, Poseidon Minotaur Archive, and Poseidon AADR Archive) that store and maintain Poseidon packages, enabling community-driven data sharing and curation.
The Poseidon framework aims to simplify data storage, acquisition, analysis, and publication in the field of human archaeogenetics, ensuring FAIR data handling and promoting computational reproducibility. The modular design allows for flexible adoption, from using the package format locally to contributing to the public archives.
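The idea behind the package format described above, keeping genotype data and context information together so that neither can drift from the other, can be illustrated with a small sketch. This is not the Poseidon schema or any Poseidon tool: the class `ToyPackage`, the function `read_context_table`, and the column names (`sample_id`, `latitude`, `longitude`, `date_bc_ad`) are hypothetical and chosen only for illustration. It assumes the context data is a tab-separated table with one row per sample.

```python
from __future__ import annotations

import csv
import io
from dataclasses import dataclass, field


@dataclass
class ToyPackage:
    """Hypothetical stand-in for a package: genotype sample IDs plus context rows."""

    title: str
    genotype_sample_ids: list[str]
    context_rows: list[dict[str, str]] = field(default_factory=list)

    def missing_context(self) -> list[str]:
        """Return genotype sample IDs that have no matching context row."""
        known = {row["sample_id"] for row in self.context_rows}
        return [s for s in self.genotype_sample_ids if s not in known]


def read_context_table(text: str) -> list[dict[str, str]]:
    """Parse a tab-separated context table into one dict per sample."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


# Illustrative context table; the columns are assumptions, not the real schema.
CONTEXT_TSV = (
    "sample_id\tlatitude\tlongitude\tdate_bc_ad\n"
    "ind001\t48.2\t16.4\t-2500\n"
    "ind002\t51.3\t12.4\t-1800\n"
)

package = ToyPackage(
    title="toy_package",
    genotype_sample_ids=["ind001", "ind002", "ind003"],
    context_rows=read_context_table(CONTEXT_TSV),
)

# ind003 has genotype data but no context row, so it is reported.
print(package.missing_context())
```

A check like `missing_context` is the kind of consistency validation that becomes possible once genotype and context data live in one structured unit. The actual package specification and the trident tool define their own files, fields, and validation rules.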
Stats
"Archaeogenetic samples can only be effectively analysed with context data."
"Recently, the threshold of genome-wide data for 10,000 ancient human individuals has been surpassed."
"Poseidon features public archives with per-article packages that can be downloaded through an open web API."
Quotes
"To make all this new data publicly available, researchers can partly rely on existing infrastructure for the archival and distribution of modern genetic data, such as the Sequence Read Archive (SRA) [11], the European Nucleotide Archive (ENA) [12] or other INSDC databases (https://www.insdc.org). However, this infrastructure has not been prepared to also capture the rich context-data ranging from archaeological field observations to radiocarbon dating that accompanies ancient samples."
"Poseidon emphasises human- and machine-readable data storage, the development of convenient and interoperable command line software, and a high degree of source granularity to elevate the original data publication to the main unit of long-term curation."