Abstract
Inference of individual admixture coefficients, which is important for
population genetic and association studies, is commonly performed using
compute-intensive likelihood algorithms. With the availability of large
population genomic data sets, fast versions of likelihood algorithms have
attracted considerable attention. Reducing the computational burden of
estimation algorithms remains, however, a major challenge. Here, we present a
fast and efficient method for estimating individual admixture coefficients
based on sparse non-negative matrix factorization algorithms. We implemented
our method in the computer program sNMF, and applied it to human and plant
genomic data sets. The performance of sNMF was then compared with that of the
likelihood algorithm implemented in the computer program ADMIXTURE. Without
loss of accuracy, sNMF computed estimates of admixture coefficients with
run-times approximately 10 to 30 times shorter than those of ADMIXTURE.
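
To make the idea of estimating admixture coefficients by non-negative matrix factorization concrete, the following is a minimal, dependency-free sketch. It is not the sNMF program's algorithm: it uses plain Lee-Seung multiplicative updates for squared error, with no sparsity penalty, and works on raw genotype counts. The function name `nmf_admixture`, its parameters, and the toy genotype matrix are all hypothetical and chosen only for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf_admixture(x, k, iters=500, seed=0, eps=1e-9):
    """Factor x (individuals x loci) as q @ g with non-negative entries.

    Illustrative only: generic multiplicative-update NMF, not sNMF itself.
    Rows of q are rescaled to sum to 1 at the end so they can be read as
    ancestry proportions; after that rescaling, q @ g no longer
    approximates x on the original scale.
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive random start so multiplicative updates can move.
    q = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    g = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # Update g: g <- g * (q^T x) / (q^T q g)
        qt = transpose(q)
        num, den = matmul(qt, x), matmul(matmul(qt, q), g)
        g = [[g[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # Update q: q <- q * (x g^T) / (q g g^T)
        gt = transpose(g)
        num, den = matmul(x, gt), matmul(q, matmul(g, gt))
        q = [[q[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    # Rescale each individual's row to sum to 1 (admixture proportions).
    q = [[v / (sum(row) + eps) for v in row] for row in q]
    return q, g


# Toy data (made up): 4 individuals, 6 loci, genotypes coded 0/1/2.
genotypes = [
    [2, 2, 2, 0, 0, 0],
    [2, 2, 1, 0, 0, 0],
    [0, 0, 0, 2, 2, 2],
    [1, 1, 1, 1, 1, 1],
]
q, g = nmf_admixture(genotypes, k=2)
for row in q:
    print([round(v, 3) for v in row])
```

Each printed row holds one individual's estimated proportions across the two assumed ancestral populations. The order of the columns is arbitrary and depends on the random start.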