Article

A unified genealogy of modern and ancient genomes

A. Wohns, Y. Wong, B. Jeffery, A. Akbari, S. Mallick, R. Pinhasi, N. Patterson, D. Reich, J. Kelleher, and G. McVean.
bioRxiv, (2021)
DOI: 10.1101/2021.02.16.431497

Abstract

The sequencing of modern and ancient genomes from around the world has revolutionised our understanding of human history and evolution [1,2]. However, the general problem of how best to characterise the full complexity of ancestral relationships from the totality of human genomic variation remains unsolved. Patterns of variation in each data set are typically analysed independently, and often using parametric models or data reduction techniques that cannot capture the full complexity of human ancestry [3,4]. Moreover, variation in sequencing technology [5,6], data quality [7] and in silico processing [8,9], coupled with complexities of data scale [10], limit the ability to integrate data sources. Here, we introduce a non-parametric approach to inferring human genealogical history that overcomes many of these challenges and enables us to build the largest genealogy of both modern and ancient humans yet constructed. The genealogy provides a lossless and compact representation of multiple datasets, addresses the challenges of missing and erroneous data, and benefits from using ancient samples to constrain and date relationships. Using simulations and empirical analyses, we demonstrate the power of the method to recover relationships between individuals and populations, as well as to identify descendants of ancient samples. Finally, we show how applying a simple non-parametric estimator of ancestor geographical location to the inferred genealogy recapitulates key events in human history. Our results demonstrate that whole-genome genealogies are a powerful means of synthesising genetic data and provide rich insights into human evolution.

Competing Interest Statement: GM is a director of and shareholder in Genomics plc and a partner in Peptide Groove LLP.

BibTeX key: wohns2021unified
entry type: article
year: 2021
journal: bioRxiv
publisher: Cold Spring Harbor Laboratory
elocation-id: 2021.02.16.431497
eprint: https://www.biorxiv.org/content/early/2021/04/15/2021.02.16.431497.full.pdf
DOI: 10.1101/2021.02.16.431497
url: https://www.biorxiv.org/content/early/2021/04/15/2021.02.16.431497

Cite this publication

%0 Journal Article
%1 wohns2021unified
%A Wohns, Anthony Wilder
%A Wong, Yan
%A Jeffery, Ben
%A Akbari, Ali
%A Mallick, Swapan
%A Pinhasi, Ron
%A Patterson, Nick
%A Reich, David
%A Kelleher, Jerome
%A McVean, Gil
%D 2021
%I Cold Spring Harbor Laboratory
%J bioRxiv
%K methods tree_sequence
%R 10.1101/2021.02.16.431497
%T A unified genealogy of modern and ancient genomes
%U https://www.biorxiv.org/content/early/2021/04/15/2021.02.16.431497

@article{wohns2021unified,
  added-at = {2021-07-15T07:12:56.000+0200},
  author = {Wohns, Anthony Wilder and Wong, Yan and Jeffery, Ben and Akbari, Ali and Mallick, Swapan and Pinhasi, Ron and Patterson, Nick and Reich, David and Kelleher, Jerome and McVean, Gil},
  biburl = {https://www.bibsonomy.org/bibtex/25912aa883f2f3856993c4e21fb644e02/peter.ralph},
  doi = {10.1101/2021.02.16.431497},
  elocation-id = {2021.02.16.431497},
  eprint = {https://www.biorxiv.org/content/early/2021/04/15/2021.02.16.431497.full.pdf},
  interhash = {9e0916faf1b64c45784e007be278fb85},
  intrahash = {5912aa883f2f3856993c4e21fb644e02},
  journal = {bioRxiv},
  keywords = {methods tree_sequence},
  publisher = {Cold Spring Harbor Laboratory},
  timestamp = {2021-07-15T07:12:56.000+0200},
  title = {A unified genealogy of modern and ancient genomes},
  url = {https://www.biorxiv.org/content/early/2021/04/15/2021.02.16.431497},
  year = 2021
}
