Article,

PoolHap: Inferring Haplotype Frequencies from Pooled Samples by Next Generation Sequencing

Q. Long, D. Jeffares, Q. Zhang, K. Ye, V. Nizhynska, Z. Ning, C. Tyler-Smith, and M. Nordborg.
PLoS ONE, 6 (1): e15292 (January 2011)
DOI: 10.1371/journal.pone.0015292

Abstract

With the advance of next-generation sequencing (NGS) technologies, increasingly ambitious applications are becoming feasible. A particularly powerful one is the sequencing of polymorphic, pooled samples. The pool can be naturally occurring, as in the case of multiple pathogen strains in a blood sample, multiple types of cells in a cancerous tissue sample, or multiple isoforms of mRNA in a cell. In these cases, it's difficult or impossible to partition the subtypes experimentally before sequencing, and those subtype frequencies must hence be inferred. In addition, investigators may occasionally want to artificially pool the sample of a large number of individuals for reasons of cost-efficiency, e.g., when carrying out genetic mapping using bulked segregant analysis. Here we describe PoolHap, a computational tool for inferring haplotype frequencies from pooled samples when haplotypes are known. The key insight into why PoolHap works is that the large number of SNPs that come with genome-wide coverage can compensate for the uneven coverage across the genome. The performance of PoolHap is illustrated and discussed using simulated and real data. We show that PoolHap is able to accurately estimate the proportions of haplotypes with less than 2% error for 34-strain mixtures with 2X total coverage Arabidopsis thaliana whole genome polymorphism data. This method should facilitate greater biological insight into heterogeneous samples that are difficult or impossible to isolate experimentally. Software and users manual are freely available at http://arabidopsis.gmi.oeaw.ac.at/quan/poolhap/.

BibTeX key: long2011poolhap
entry type: article
year: 2011
month: 01
journal: PLoS ONE
number: 1
pages: e15292
publisher: Public Library of Science
volume: 6
DOI: 10.1371/journal.pone.0015292
url: http://dx.doi.org/10.1371%2Fjournal.pone.0015292

Cite this publication

%0 Journal Article
%1 long2011poolhap
%A Long, Quan
%A Jeffares, Daniel C.
%A Zhang, Qingrun
%A Ye, Kai
%A Nizhynska, Viktoria
%A Ning, Zemin
%A Tyler-Smith, Chris
%A Nordborg, Magnus
%D 2011
%I Public Library of Science
%J PLoS ONE
%K haplotype_inference methods pooled_samples sequence_data
%N 1
%P e15292
%R 10.1371/journal.pone.0015292
%T PoolHap: Inferring Haplotype Frequencies from Pooled Samples by Next Generation Sequencing
%U http://dx.doi.org/10.1371%2Fjournal.pone.0015292
%V 6
%X With the advance of next-generation sequencing (NGS) technologies, increasingly ambitious applications are becoming feasible. A particularly powerful one is the sequencing of polymorphic, pooled samples. The pool can be naturally occurring, as in the case of multiple pathogen strains in a blood sample, multiple types of cells in a cancerous tissue sample, or multiple isoforms of mRNA in a cell. In these cases, it's difficult or impossible to partition the subtypes experimentally before sequencing, and those subtype frequencies must hence be inferred. In addition, investigators may occasionally want to artificially pool the sample of a large number of individuals for reasons of cost-efficiency, e.g., when carrying out genetic mapping using bulked segregant analysis. Here we describe PoolHap, a computational tool for inferring haplotype frequencies from pooled samples when haplotypes are known. The key insight into why PoolHap works is that the large number of SNPs that come with genome-wide coverage can compensate for the uneven coverage across the genome. The performance of PoolHap is illustrated and discussed using simulated and real data. We show that PoolHap is able to accurately estimate the proportions of haplotypes with less than 2% error for 34-strain mixtures with 2X total coverage Arabidopsis thaliana whole genome polymorphism data. This method should facilitate greater biological insight into heterogeneous samples that are difficult or impossible to isolate experimentally. Software and users manual are freely available at http://arabidopsis.gmi.oeaw.ac.at/quan/poolhap/.

@article{long2011poolhap,
  abstract = {With the advance of next-generation sequencing (NGS) technologies, increasingly ambitious applications are becoming feasible. A particularly powerful one is the sequencing of polymorphic, pooled samples. The pool can be naturally occurring, as in the case of multiple pathogen strains in a blood sample, multiple types of cells in a cancerous tissue sample, or multiple isoforms of mRNA in a cell. In these cases, it's difficult or impossible to partition the subtypes experimentally before sequencing, and those subtype frequencies must hence be inferred. In addition, investigators may occasionally want to artificially pool the sample of a large number of individuals for reasons of cost-efficiency, e.g., when carrying out genetic mapping using bulked segregant analysis. Here we describe PoolHap, a computational tool for inferring haplotype frequencies from pooled samples when haplotypes are known. The key insight into why PoolHap works is that the large number of SNPs that come with genome-wide coverage can compensate for the uneven coverage across the genome. The performance of PoolHap is illustrated and discussed using simulated and real data. We show that PoolHap is able to accurately estimate the proportions of haplotypes with less than 2\% error for 34-strain mixtures with 2X total coverage \textit{Arabidopsis thaliana} whole genome polymorphism data. This method should facilitate greater biological insight into heterogeneous samples that are difficult or impossible to isolate experimentally. Software and users manual are freely available at http://arabidopsis.gmi.oeaw.ac.at/quan/poolhap/.},
  added-at = {2012-04-08T20:03:19.000+0200},
  author = {Long, Quan and Jeffares, Daniel C. and Zhang, Qingrun and Ye, Kai and Nizhynska, Viktoria and Ning, Zemin and Tyler-Smith, Chris and Nordborg, Magnus},
  biburl = {https://www.bibsonomy.org/bibtex/22e050e4be69ab98c7fcec4d81b252521/peter.ralph},
  description = {PLoS ONE: PoolHap: Inferring Haplotype Frequencies from Pooled Samples by Next Generation Sequencing},
  doi = {10.1371/journal.pone.0015292},
  interhash = {ff838d40e645b586dea8ce4f577fe101},
  intrahash = {2e050e4be69ab98c7fcec4d81b252521},
  journal = {PLoS ONE},
  keywords = {haplotype_inference methods pooled_samples sequence_data},
  month = {01},
  number = 1,
  pages = {e15292},
  publisher = {Public Library of Science},
  timestamp = {2012-04-08T20:03:19.000+0200},
  title = {PoolHap: Inferring Haplotype Frequencies from Pooled Samples by Next Generation Sequencing},
  url = {http://dx.doi.org/10.1371%2Fjournal.pone.0015292},
  volume = 6,
  year = 2011
}
