Correction for hidden confounders in the genetic analysis of gene expression

J. Listgarten, C. Kadie, E. Schadt, and D. Heckerman.
Proceedings of the National Academy of Sciences, 107 (38): 16465--16470 (Sep 21, 2010)
DOI: 10.1073/pnas.1002425107

Abstract

Understanding the genetic underpinnings of disease is important for screening, treatment, drug development, and basic biological insight. One way of getting at such an understanding is to find out which parts of our DNA, such as single-nucleotide polymorphisms, affect particular intermediary processes such as gene expression. Naively, such associations can be identified using a simple statistical test on all paired combinations of genetic variants and gene transcripts. However, a wide variety of confounders lie hidden in the data, leading to both spurious associations and missed associations if not properly addressed. We present a statistical model that jointly corrects for two particular kinds of hidden structure—population structure (e.g., race, family-relatedness), and microarray expression artifacts (e.g., batch effects), when these confounders are unknown. Applying our method to both real and synthetic, human and mouse data, we demonstrate the need for such a joint correction of confounders, and also the disadvantages of other possible approaches based on those in the current literature. In particular, we show that our class of models has maximum power to detect eQTL on synthetic data, and has the best performance on a bronze standard applied to real data. Lastly, our software and the associations we found with it are available at http://www.microsoft.com/science.

BibTeX key: Listgarten2010Correction
entry type: article
year: 2010
month: sep
day: 21
journal: Proceedings of the National Academy of Sciences
number: 38
pages: 16465--16470
publisher: National Academy of Sciences
volume: 107
citeulike-linkout-2: http://www.pnas.org/content/early/2010/08/30/1002425107.full.pdf
citeulike-linkout-1: http://www.pnas.org/content/early/2010/08/30/1002425107.abstract
citeulike-linkout-4: http://view.ncbi.nlm.nih.gov/pubmed/20810919
citeulike-linkout-3: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2944732/
citeulike-linkout-5: http://www.hubmed.org/display.cgi?uids=20810919
citeulike-article-id: 7799935
pmid: 20810919
priority: 2
posted-at: 2010-09-08 10:16:16
issn: 1091-6490
citeulike-linkout-0: http://dx.doi.org/10.1073/pnas.1002425107
pmcid: PMC2944732
DOI: 10.1073/pnas.1002425107
url: http://dx.doi.org/10.1073/pnas.1002425107

Cite this publication

@article{Listgarten2010Correction,
  abstract = {Understanding the genetic underpinnings of disease is important for screening, treatment, drug development, and basic biological insight. One way of getting at such an understanding is to find out which parts of our {DNA}, such as single-nucleotide polymorphisms, affect particular intermediary processes such as gene expression. Naively, such associations can be identified using a simple statistical test on all paired combinations of genetic variants and gene transcripts. However, a wide variety of confounders lie hidden in the data, leading to both spurious associations and missed associations if not properly addressed. We present a statistical model that jointly corrects for two particular kinds of hidden structure—population structure (e.g., race, family-relatedness), and microarray expression artifacts (e.g., batch effects), when these confounders are unknown. Applying our method to both real and synthetic, human and mouse data, we demonstrate the need for such a joint correction of confounders, and also the disadvantages of other possible approaches based on those in the current literature. In particular, we show that our class of models has maximum power to detect {eQTL} on synthetic data, and has the best performance on a bronze standard applied to real data. Lastly, our software and the associations we found with it are available at http://www.microsoft.com/science.},
  added-at = {2018-12-02T16:09:07.000+0100},
  author = {Listgarten, Jennifer and Kadie, Carl and Schadt, Eric E. and Heckerman, David},
  biburl = {https://www.bibsonomy.org/bibtex/217dba44bd4bb091d631df6645822eb3d/karthikraman},
  citeulike-article-id = {7799935},
  citeulike-linkout-0 = {http://dx.doi.org/10.1073/pnas.1002425107},
  citeulike-linkout-1 = {http://www.pnas.org/content/early/2010/08/30/1002425107.abstract},
  citeulike-linkout-2 = {http://www.pnas.org/content/early/2010/08/30/1002425107.full.pdf},
  citeulike-linkout-3 = {http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2944732/},
  citeulike-linkout-4 = {http://view.ncbi.nlm.nih.gov/pubmed/20810919},
  citeulike-linkout-5 = {http://www.hubmed.org/display.cgi?uids=20810919},
  day = 21,
  doi = {10.1073/pnas.1002425107},
  interhash = {7e8f4c168652efe826ef7b9ceabc357f},
  intrahash = {17dba44bd4bb091d631df6645822eb3d},
  issn = {1091-6490},
  journal = {Proceedings of the National Academy of Sciences},
  keywords = {genetic-analysis statistical-model},
  month = sep,
  number = 38,
  pages = {16465--16470},
  pmcid = {PMC2944732},
  pmid = {20810919},
  posted-at = {2010-09-08 10:16:16},
  priority = {2},
  publisher = {National Academy of Sciences},
  timestamp = {2018-12-02T16:09:07.000+0100},
  title = {Correction for hidden confounders in the genetic analysis of gene expression},
  url = {http://dx.doi.org/10.1073/pnas.1002425107},
  volume = 107,
  year = 2010
}
