Article

Eigenanalysis of SNP data with an identity by descent interpretation

Theoretical Population Biology, (2015)
DOI: http://dx.doi.org/10.1016/j.tpb.2015.09.004

Abstract

Principal component analysis (PCA) is widely used in genome-wide association studies (GWAS), and the principal component axes often represent perpendicular gradients in geographic space. Interpreting PCA results is of major interest to geneticists who want to understand fundamental demographic parameters. Here, we provide an interpretation of PCA based on relatedness measures, which are described by the probability that sets of genes are identical-by-descent (IBD). An approximately linear transformation is found between the ancestral proportions (APs) of individuals with multiple ancestries and their projections onto the principal components. In addition, a new eigenanalysis method, “EIGMIX”, is proposed to estimate individual ancestries. EIGMIX is a method of moments that is computationally efficient enough for data sets with millions of SNPs, and it is not subject to the assumption of linkage equilibrium. Given the assumptions of multiple ancestries and surrogate samples for those ancestries, EIGMIX is able to infer the APs of individuals. The methods were applied to SNP data from the HapMap Phase 3 project and the Human Genome Diversity Panel. The APs of individuals inferred by EIGMIX are consistent with the findings of the program ADMIXTURE. In conclusion, EIGMIX can be used to detect population structure and estimate genome-wide ancestral proportions with relatively high accuracy.
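
The approximately linear relationship between ancestral proportions and principal component projections described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' EIGMIX implementation: it assumes two ancestral populations, unlinked SNPs, and hypothetical allele frequencies, and all function names and parameters are invented for illustration. It computes the first eigenvector of a standardized genotype relatedness matrix and rescales it linearly using the centroids of the surrogate ancestral samples.

```python
import numpy as np


def simulate_admixed_genotypes(n_anc=50, n_admixed=100, n_snps=2000, seed=0):
    """Simulate genotypes (0/1/2) for two ancestral samples plus admixed individuals.

    All frequencies and sample sizes here are hypothetical choices for illustration.
    """
    rng = np.random.default_rng(seed)
    # Allele frequencies in ancestral population 1, and a perturbed copy for population 2.
    p1 = rng.uniform(0.1, 0.9, n_snps)
    p2 = np.clip(p1 + rng.normal(0.0, 0.15, n_snps), 0.05, 0.95)
    # True proportion of ancestry from population 1 for every individual:
    # surrogate samples have 1 or 0, admixed individuals are uniform on [0, 1].
    ap = np.concatenate([np.ones(n_anc), np.zeros(n_anc), rng.uniform(0.0, 1.0, n_admixed)])
    # Individual-specific allele frequency is a mixture of the ancestral frequencies.
    freq = ap[:, None] * p1 + (1.0 - ap[:, None]) * p2
    geno = rng.binomial(2, freq)
    return geno, ap


def first_pc(geno):
    """Return the leading eigenvector of the standardized genotype relatedness matrix."""
    p = geno.mean(axis=0) / 2.0
    keep = (p > 0.0) & (p < 1.0)          # drop monomorphic SNPs
    g, p = geno[:, keep], p[keep]
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    grm = z @ z.T / z.shape[1]
    _, vecs = np.linalg.eigh(grm)         # eigenvalues are returned in ascending order
    return vecs[:, -1]


def ap_from_pc(pc, idx_pop1, idx_pop2):
    """Linearly rescale PC coordinates so the surrogate centroids map to 1 and 0."""
    c1, c2 = pc[idx_pop1].mean(), pc[idx_pop2].mean()
    return (pc - c2) / (c1 - c2)


geno, true_ap = simulate_admixed_genotypes()
est_ap = ap_from_pc(first_pc(geno), slice(0, 50), slice(50, 100))
print("mean absolute error (admixed individuals):", np.abs(est_ap[100:] - true_ap[100:]).mean())
```

Because the rescaling uses the difference of the two surrogate centroids, the arbitrary sign of the eigenvector does not affect the estimates. With more than two ancestries, additional eigenvectors and centroids would be needed.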
