TY - JOUR
T1 - PSSV
T2 - A novel pattern-based probabilistic approach for somatic structural variation identification
AU - Chen, Xi
AU - Shi, Xu
AU - Hilakivi-Clarke, Leena
AU - Shajahan-Haq, Ayesha N.
AU - Clarke, Robert
AU - Xuan, Jianhua
N1 - Publisher Copyright: © The Author 2017. Published by Oxford University Press. All rights reserved.
PY - 2017/1/15
Y1 - 2017/1/15
N2 - Motivation: Whole genome DNA-sequencing (WGS) of paired tumor and normal samples has enabled the identification of somatic DNA changes in unprecedented detail. Large-scale identification of somatic structural variations (SVs) for a specific cancer type will deepen our understanding of driver mechanisms in cancer progression. However, the limited number of WGS samples, insufficient read coverage, and the impurity of tumor samples that contain both normal and neoplastic cells limit reliable and accurate detection of somatic SVs. Results: We present a novel pattern-based probabilistic approach, PSSV, to identify somatic structural variations from WGS data. PSSV features a mixture model with hidden states representing different mutation patterns; PSSV can thus differentiate heterozygous and homozygous SVs in each sample, enabling the identification of those somatic SVs with heterozygous mutations in normal samples and homozygous mutations in tumor samples. Simulation studies demonstrate that PSSV outperforms existing tools. PSSV has been successfully applied to breast cancer data to identify somatic SVs of key factors associated with breast cancer development.
AB - Motivation: Whole genome DNA-sequencing (WGS) of paired tumor and normal samples has enabled the identification of somatic DNA changes in unprecedented detail. Large-scale identification of somatic structural variations (SVs) for a specific cancer type will deepen our understanding of driver mechanisms in cancer progression. However, the limited number of WGS samples, insufficient read coverage, and the impurity of tumor samples that contain both normal and neoplastic cells limit reliable and accurate detection of somatic SVs. Results: We present a novel pattern-based probabilistic approach, PSSV, to identify somatic structural variations from WGS data. PSSV features a mixture model with hidden states representing different mutation patterns; PSSV can thus differentiate heterozygous and homozygous SVs in each sample, enabling the identification of those somatic SVs with heterozygous mutations in normal samples and homozygous mutations in tumor samples. Simulation studies demonstrate that PSSV outperforms existing tools. PSSV has been successfully applied to breast cancer data to identify somatic SVs of key factors associated with breast cancer development.
UR - http://www.scopus.com/inward/record.url?scp=85028326724&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85028326724&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btw605
DO - 10.1093/bioinformatics/btw605
M3 - Article
C2 - 27659451
AN - SCOPUS:85028326724
SN - 1367-4803
VL - 33
SP - 177
EP - 183
JO - Bioinformatics
JF - Bioinformatics
IS - 2
ER -