Abstract
As newly sequenced proteins are deposited into the
world's ever-growing archive of protein sequences, they
are typically immediately tested by various algorithms
for clues as to their biological structure and
function. One question about a new protein involves its
cellular location, that is, where the protein
resides in a living organism (extracellular, membrane,
etc.). A human-created five-way classification algorithm
for cellular location, based on statistical techniques
and achieving 76% accuracy, was recently reported. This
paper describes a two-way algorithm, evolved using
genetic programming, that achieves 83% accuracy in
determining whether a protein is extracellular and 89%
accuracy in determining whether it is a membrane
protein. Unlike the statistical calculation, the
genetically evolved algorithm employs a large and
varied arsenal of computational capabilities, including
arithmetic functions, conditional operations,
subroutines, iterations, memory, data structures,
set-creating operations, macro definitions, recursion,
etc. The genetically evolved classification algorithm
can be viewed as an extension (which we call a
programmatic motif) of the conventional notion of a
protein motif.