Background
Previous studies on bladder cancer have shown nodal
involvement to be an independent indicator of prognosis
and survival. This study aimed at developing an
objective method for detection of nodal metastasis from
molecular profiles of primary urothelial carcinoma
tissues.
Methods
The study included primary bladder tumor tissues from
60 patients across different stages and 5 control
tissues of normal urothelium. The entire cohort was
divided into training and validation sets comprised of
node positive and node negative subjects. Quantitative
expression profiling was performed for a panel of 70
genes using standardized competitive RT-PCR and the
expression values of the training set samples were run
through an iterative machine learning process called
genetic programming that employed an N-fold cross
validation technique to generate classifier rules of
limited complexity. These were then used in a voting
algorithm to classify the validation set samples into
those associated with or without nodal metastasis.
Results
The generated classifier rules using 70 genes
demonstrated 81% accuracy on the validation set
when compared to the pathological nodal status. The
rules showed a strong predilection for ICAM1, MAP2K6
and KDR resulting in gene expression motifs that
cumulatively suggested a pattern ICAM1>MAP2K6>KDR for
node positive cases. Additionally, the motifs showed
CDK8 to be lower relative to ICAM1, and ANXA5 to be
relatively high by itself in node positive tumors.
Rules generated using only ICAM1, MAP2K6 and KDR were
comparably robust, with a single representative rule
producing an accuracy of 90% when used by itself
on the validation set, suggesting a crucial role for
these genes in nodal metastasis.
Conclusion
Our study demonstrates the use of standardized
quantitative gene expression values from primary
bladder tumor tissues as inputs in a genetic
programming system to generate classifier rules for
determining the nodal status. Our method also suggests
the involvement of ICAM1, MAP2K6, KDR, CDK8 and ANXA5
in unique mathematical combinations in the progression
towards nodal positivity. Further studies are needed to
identify more class-specific signatures and confirm the
role of these genes in the evolution of nodal
metastasis in bladder cancer.
Copyright 2006 Mitra et al; licensee BioMed Central
Ltd.
oai
oai:biomedcentral.com:1471-2407-6-159
language
en
notes
p2 'Since scaling the gene expression levels to
represent fold changes relative to a base value could
have biased the significance of these gene'
65 samples. 11-fold cross validation. Max 7 genes per
program.
mixing of folds and majority voting scheme. 100
Generations. p6 Analysis of gene usage 'motifs'
(requires GP, could not be done with other approaches).
Indicate possible biochemical pathways.
p7 'Gene transitivity'. p12 'hypothesis-generating
nature of GP'
p12 'A unique feature of GP is the final output, which
consists of easily readable rules expressed as
executable classifier programs that define tangible
relationships between the most influential genes.' p12
'filtering can create an incomplete and biased dataset
that may not be representative of many complex
biological systems. The curse of
dimensionality'
p13 'hierarchical, KNN, K-means clustering and Neural
Nets which do not scale easily to larger numbers of
variables.'
p13 GP can 'handle missing values in the data'.
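
The gene motifs and the majority voting scheme mentioned above can be illustrated with a minimal sketch. This is not the authors' evolved classifier: the three rule functions, the ANXA5 threshold and the sample values are hypothetical, chosen only to mirror the relationships named in the abstract (ICAM1>MAP2K6>KDR, CDK8 lower than ICAM1, ANXA5 relatively high).

```python
from typing import Callable, Dict, List

Sample = Dict[str, float]          # gene symbol -> expression value
Rule = Callable[[Sample], bool]    # True means "votes node positive"

# Hypothetical cutoff; the source does not state what "relatively high" means.
ANXA5_THRESHOLD = 1.0


def motif_rule(s: Sample) -> bool:
    """Motif from the abstract: ICAM1 > MAP2K6 > KDR in node positive cases."""
    return s["ICAM1"] > s["MAP2K6"] > s["KDR"]


def cdk8_rule(s: Sample) -> bool:
    """CDK8 lower relative to ICAM1 in node positive cases."""
    return s["CDK8"] < s["ICAM1"]


def anxa5_rule(s: Sample) -> bool:
    """ANXA5 relatively high by itself (threshold is an assumption)."""
    return s["ANXA5"] > ANXA5_THRESHOLD


def majority_vote(sample: Sample, rules: List[Rule]) -> bool:
    """Call a sample node positive if more than half of the rules say so."""
    positive_votes = sum(1 for rule in rules if rule(sample))
    return positive_votes * 2 > len(rules)


RULES: List[Rule] = [motif_rule, cdk8_rule, anxa5_rule]

# Made-up expression values, for illustration only.
example = {"ICAM1": 5.0, "MAP2K6": 3.0, "KDR": 1.0, "CDK8": 2.0, "ANXA5": 4.0}
print(majority_vote(example, RULES))
```

Each rule stays readable as a direct comparison between genes, which is the property the notes highlight for GP output; the vote only combines those comparisons.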
%0 Journal Article
%1 oai:biomedcentral.com:1471-2407-6-159
%A Mitra, Anirban P
%A Almal, Arpit A
%A George, Ben
%A Fry, David W
%A Lenehan, Peter F
%A Pagliarulo, Vincenzo
%A Cote, Richard J
%A Datar, Ram H
%A Worzel, William P
%D 2006
%I BioMed Central Ltd.
%J BMC Cancer
%K AUROC algorithms, genetic programming,
%N 159
%T The use of genetic programming in the analysis of
quantitative gene expression profiles for
identification of nodal status in bladder cancer
%U http://www.biomedcentral.com/content/pdf/1471-2407-6-159.pdf
%V 6
%X Background
Previous studies on bladder cancer have shown nodal
involvement to be an independent indicator of prognosis
and survival. This study aimed at developing an
objective method for detection of nodal metastasis from
molecular profiles of primary urothelial carcinoma
tissues.
Methods
The study included primary bladder tumor tissues from
60 patients across different stages and 5 control
tissues of normal urothelium. The entire cohort was
divided into training and validation sets comprised of
node positive and node negative subjects. Quantitative
expression profiling was performed for a panel of 70
genes using standardized competitive RT-PCR and the
expression values of the training set samples were run
through an iterative machine learning process called
genetic programming that employed an N-fold cross
validation technique to generate classifier rules of
limited complexity. These were then used in a voting
algorithm to classify the validation set samples into
those associated with or without nodal metastasis.
Results
The generated classifier rules using 70 genes
demonstrated 81% accuracy on the validation set
when compared to the pathological nodal status. The
rules showed a strong predilection for ICAM1, MAP2K6
and KDR resulting in gene expression motifs that
cumulatively suggested a pattern ICAM1>MAP2K6>KDR for
node positive cases. Additionally, the motifs showed
CDK8 to be lower relative to ICAM1, and ANXA5 to be
relatively high by itself in node positive tumors.
Rules generated using only ICAM1, MAP2K6 and KDR were
comparably robust, with a single representative rule
producing an accuracy of 90% when used by itself
on the validation set, suggesting a crucial role for
these genes in nodal metastasis.
Conclusion
Our study demonstrates the use of standardized
quantitative gene expression values from primary
bladder tumor tissues as inputs in a genetic
programming system to generate classifier rules for
determining the nodal status. Our method also suggests
the involvement of ICAM1, MAP2K6, KDR, CDK8 and ANXA5
in unique mathematical combinations in the progression
towards nodal positivity. Further studies are needed to
identify more class-specific signatures and confirm the
role of these genes in the evolution of nodal
metastasis in bladder cancer.
@article{oai:biomedcentral.com:1471-2407-6-159,
abstract = {Background
Previous studies on bladder cancer have shown nodal
involvement to be an independent indicator of prognosis
and survival. This study aimed at developing an
objective method for detection of nodal metastasis from
molecular profiles of primary urothelial carcinoma
tissues.
Methods
The study included primary bladder tumor tissues from
60 patients across different stages and 5 control
tissues of normal urothelium. The entire cohort was
divided into training and validation sets comprised of
node positive and node negative subjects. Quantitative
expression profiling was performed for a panel of 70
genes using standardized competitive RT-PCR and the
expression values of the training set samples were run
through an iterative machine learning process called
genetic programming that employed an N-fold cross
validation technique to generate classifier rules of
limited complexity. These were then used in a voting
algorithm to classify the validation set samples into
those associated with or without nodal metastasis.
Results
The generated classifier rules using 70 genes
demonstrated 81\% accuracy on the validation set
when compared to the pathological nodal status. The
rules showed a strong predilection for ICAM1, MAP2K6
and KDR resulting in gene expression motifs that
cumulatively suggested a pattern ICAM1>MAP2K6>KDR for
node positive cases. Additionally, the motifs showed
CDK8 to be lower relative to ICAM1, and ANXA5 to be
relatively high by itself in node positive tumors.
Rules generated using only ICAM1, MAP2K6 and KDR were
comparably robust, with a single representative rule
producing an accuracy of 90\% when used by itself
on the validation set, suggesting a crucial role for
these genes in nodal metastasis.
Conclusion
Our study demonstrates the use of standardized
quantitative gene expression values from primary
bladder tumor tissues as inputs in a genetic
programming system to generate classifier rules for
determining the nodal status. Our method also suggests
the involvement of ICAM1, MAP2K6, KDR, CDK8 and ANXA5
in unique mathematical combinations in the progression
towards nodal positivity. Further studies are needed to
identify more class-specific signatures and confirm the
role of these genes in the evolution of nodal
metastasis in bladder cancer.},
added-at = {2008-06-19T17:35:00.000+0200},
author = {Mitra, Anirban P and Almal, Arpit A and George, Ben and Fry, David W and Lenehan, Peter F and Pagliarulo, Vincenzo and Cote, Richard J and Datar, Ram H and Worzel, William P},
bibsource = {OAI-PMH server at www.biomedcentral.com},
biburl = {https://www.bibsonomy.org/bibtex/2c6b0e756b113f638b4af049aae793b46/brazovayeye},
interhash = {59c35496a3b28214d3e1363bcd6b14c4},
intrahash = {c6b0e756b113f638b4af049aae793b46},
issn = {1471-2407},
journal = {BMC Cancer},
keywords = {AUROC algorithms, genetic programming,},
language = {en},
month = {June~16},
notes = {p2 'Since scaling the gene expression levels to
represent fold changes relative to a base value could
have biased the significance of these gene'
65 samples. 11-fold cross validation. Max 7 genes per
program.
mixing of folds and majority voting scheme. 100
Generations. p6 Analysis of gene usage 'motifs'
(requires GP, could not be done with other approaches).
Indicate possible biochemical pathways.
p7 'Gene transitivity'. p12 'hypothesis-generating
nature of GP'
p12 'A unique feature of GP is the final output, which
consists of easily readable rules expressed as
executable classifier programs that define tangible
relationships between the most influential genes.' p12
'filtering can create an incomplete and biased dataset
that may not be representative of many complex
biological systems. The curse of
dimensionality'
p13 'hierarchical, KNN, K-means clustering and Neural
Nets which do not scale easily to larger numbers of
variables.'
p13 GP can 'handle missing values in the data'.},
number = 159,
oai = {oai:biomedcentral.com:1471-2407-6-159},
publisher = {BioMed Central Ltd.},
rights = {Copyright 2006 Mitra et al; licensee BioMed Central
Ltd.},
timestamp = {2008-06-19T17:47:26.000+0200},
title = {The use of genetic programming in the analysis of
quantitative gene expression profiles for
identification of nodal status in bladder cancer},
url = {http://www.biomedcentral.com/content/pdf/1471-2407-6-159.pdf},
volume = 6,
year = 2006
}