Abstract
An estimated 1,000,000 or so single nucleotide
polymorphisms (SNPs) exist as standing variation
in the human genome. Certain
combinations of these SNPs can interact in complex ways
to predispose individuals to a variety of common
diseases, even though individual SNPs may have no ill
effects. Detecting these epistatic combinations is a
computationally daunting task. Trying to use individual
or growing subsets of SNPs as building blocks for
detection of larger combinations of purely epistatic
SNPs (e.g., via genetic algorithms or genetic
programming) is no better than random search, since
there is no predictive power in subsets of the correct
set of epistatically interacting SNPs. Here, we explore
the potential for hill-climbing from the other
direction; that is, from large sets of candidate SNPs
to smaller ones. This approach was inspired by
Kauffman's "random chemistry" approach to detecting
small autocatalytic sets of molecules from within large
sets. Preliminary results from synthetic data sets show
that the resulting algorithm can detect epistatic pairs
from among up to 1,000 candidate SNPs in O(log N) fitness
evaluations, although the success rate degrades as
heritability declines. The results presented herein are
offered as proof of concept for the random chemistry
approach.
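
The top-down search described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the halving ratio, and above all the noise-free oracle `contains_all` are assumptions made for illustration, standing in for a data-driven fitness test whose reliability would depend on heritability. The sketch repeatedly draws a random subset half the size of the current candidate set and keeps it only if it still tests positive. For a pair, a random half retains both SNPs with probability of roughly 1/4, so each halving takes about four evaluations on average and the total grows roughly with log N.

```python
import random


def contains_all(subset, causal):
    """Toy noise-free fitness oracle (an assumption, not the paper's test):
    True when the subset still holds every causal SNP."""
    return causal <= subset


def random_chemistry(candidates, oracle, target_size=2, shrink=0.5,
                     rng=None, max_evals=100_000):
    """Shrink a candidate set toward target_size, keeping only subsets
    that the oracle still flags as containing the interacting SNPs."""
    rng = rng or random.Random()
    current = list(candidates)
    evals = 0
    while len(current) > target_size:
        # Next subset size: a fixed fraction of the current size,
        # never smaller than the target.
        new_size = max(target_size, int(len(current) * shrink))
        while True:
            if evals >= max_evals:
                raise RuntimeError("evaluation budget exhausted")
            trial = rng.sample(current, new_size)
            evals += 1
            if oracle(set(trial)):
                current = trial  # accept the smaller set and keep shrinking
                break
    return set(current), evals


if __name__ == "__main__":
    causal = {17, 512}  # hypothetical epistatic pair among 1000 SNP indices
    found, evals = random_chemistry(
        range(1000), lambda s: contains_all(s, causal), rng=random.Random(0)
    )
    print(found, evals)
```

With a noisy fitness function, a subset that has lost one of the interacting SNPs could be accepted by mistake, or a valid subset rejected, which is consistent with the reported drop in success rate as heritability declines.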