Article,

Space is the Place: Effects of Continuous Spatial Structure on Analysis of Population Genetic Data

C. Battey, P. Ralph, and A. Kern.
bioRxiv, (2019)
DOI: 10.1101/659235

Abstract

Real geography is continuous, but standard models in population genetics are based on discrete, well-mixed populations. As a result many methods of analyzing genetic data assume that samples are a random draw from a well-mixed population, but are applied to clustered samples from populations that are structured clinally over space. Here we use simulations of populations living in continuous geography to study the impacts of dispersal and sampling strategy on population genetic summary statistics, demographic inference, and genome-wide association studies. We find that most common summary statistics have distributions that differ substantially from that seen in well-mixed populations, especially when Wright’s neighborhood size is less than 100 and sampling is spatially clustered. Stepping-stone models reproduce some of these effects, but discretizing the landscape introduces artifacts which in some cases are exacerbated at higher resolutions. The combination of low dispersal and clustered sampling causes demographic inference from the site frequency spectrum to infer more turbulent demographic histories, but averaged results across multiple simulations were surprisingly robust to isolation by distance. We also show that the combination of spatially autocorrelated environments and limited dispersal causes genome-wide association studies to identify spurious signals of genetic association with purely environmentally determined phenotypes, and that this bias is only partially corrected by regressing out principal components of ancestry. Last, we discuss the relevance of our simulation results for inference from genetic variation in real organisms.

BibTeX key: battey2019space
entry type: article
year: 2019
journal: bioRxiv
publisher: Cold Spring Harbor Laboratory
elocation-id: 659235
eprint: https://www.biorxiv.org/content/early/2019/12/06/659235.full.pdf
DOI: 10.1101/659235
url: https://www.biorxiv.org/content/early/2019/12/06/659235

Cite this publication

%0 Journal Article
%1 battey2019space
%A Battey, C.J.
%A Ralph, Peter L.
%A Kern, Andrew D.
%D 2019
%I Cold Spring Harbor Laboratory
%J bioRxiv
%K geography myown simulation spatial_structure
%R 10.1101/659235
%T Space is the Place: Effects of Continuous Spatial Structure on Analysis of Population Genetic Data
%U https://www.biorxiv.org/content/early/2019/12/06/659235
%X Real geography is continuous, but standard models in population genetics are based on discrete, well-mixed populations. As a result many methods of analyzing genetic data assume that samples are a random draw from a well-mixed population, but are applied to clustered samples from populations that are structured clinally over space. Here we use simulations of populations living in continuous geography to study the impacts of dispersal and sampling strategy on population genetic summary statistics, demographic inference, and genome-wide association studies. We find that most common summary statistics have distributions that differ substantially from that seen in well-mixed populations, especially when Wright’s neighborhood size is less than 100 and sampling is spatially clustered. Stepping-stone models reproduce some of these effects, but discretizing the landscape introduces artifacts which in some cases are exacerbated at higher resolutions. The combination of low dispersal and clustered sampling causes demographic inference from the site frequency spectrum to infer more turbulent demographic histories, but averaged results across multiple simulations were surprisingly robust to isolation by distance. We also show that the combination of spatially autocorrelated environments and limited dispersal causes genome-wide association studies to identify spurious signals of genetic association with purely environmentally determined phenotypes, and that this bias is only partially corrected by regressing out principal components of ancestry. Last, we discuss the relevance of our simulation results for inference from genetic variation in real organisms.

@article{battey2019space,
  abstract = {Real geography is continuous, but standard models in population genetics are based on discrete, well-mixed populations. As a result many methods of analyzing genetic data assume that samples are a random draw from a well-mixed population, but are applied to clustered samples from populations that are structured clinally over space. Here we use simulations of populations living in continuous geography to study the impacts of dispersal and sampling strategy on population genetic summary statistics, demographic inference, and genome-wide association studies. We find that most common summary statistics have distributions that differ substantially from that seen in well-mixed populations, especially when Wright{\textquoteright}s neighborhood size is less than 100 and sampling is spatially clustered. Stepping-stone models reproduce some of these effects, but discretizing the landscape introduces artifacts which in some cases are exacerbated at higher resolutions. The combination of low dispersal and clustered sampling causes demographic inference from the site frequency spectrum to infer more turbulent demographic histories, but averaged results across multiple simulations were surprisingly robust to isolation by distance. We also show that the combination of spatially autocorrelated environments and limited dispersal causes genome-wide association studies to identify spurious signals of genetic association with purely environmentally determined phenotypes, and that this bias is only partially corrected by regressing out principal components of ancestry. Last, we discuss the relevance of our simulation results for inference from genetic variation in real organisms.},
  added-at = {2020-02-26T04:11:01.000+0100},
  author = {Battey, C.J. and Ralph, Peter L. and Kern, Andrew D.},
  biburl = {https://www.bibsonomy.org/bibtex/2134e61c084faef380e74abeac84d8aed/peter.ralph},
  doi = {10.1101/659235},
  elocation-id = {659235},
  eprint = {https://www.biorxiv.org/content/early/2019/12/06/659235.full.pdf},
  interhash = {2bb702875b977f995738799d6925887d},
  intrahash = {134e61c084faef380e74abeac84d8aed},
  journal = {bioRxiv},
  keywords = {geography myown simulation spatial_structure},
  publisher = {Cold Spring Harbor Laboratory},
  timestamp = {2020-02-26T04:11:01.000+0100},
  title = {Space is the Place: Effects of Continuous Spatial Structure on Analysis of Population Genetic Data},
  url = {https://www.biorxiv.org/content/early/2019/12/06/659235},
  year = 2019
}
