Abstract
The Chromosome-centric Human Proteome Project (C-HPP) seeks to comprehensively characterize all protein products coded by the genome, including those expressed sequence variants confirmed via proteogenomics methods. The closely related Biology/Disease-driven Human Proteome Project (B/D-HPP) seeks to understand the biological and pathological associations of expressed protein products, especially those carrying sequence variants that may be drivers of disease. To achieve these objectives, informatics tools are required that interpret potential functional or disease implications of variant protein sequences detected via proteogenomics. Toward this end, we have developed an automated workflow within the Galaxy for Proteomics (Galaxy-P) platform, which leverages the Cancer-Related Analysis of Variants Toolkit (CRAVAT) and makes it interoperable with proteogenomic results. Protein sequence variants confirmed by proteogenomics are assessed for potential structure-function effects as well as associations with cancer using CRAVAT's rich suite of functionalities, including visualization of results directly within the Galaxy user interface. We demonstrate the effectiveness of this workflow on proteogenomic results generated from an MCF7 breast cancer cell line. Our free and open software should enable improved interpretation of the functional and pathological effects of protein sequence variants detected via proteogenomics, acting as a bridge between the C-HPP and B/D-HPP.
Original language | English (US) |
---|---|
Pages (from-to) | 4329-4336 |
Number of pages | 8 |
Journal | Journal of Proteome Research |
Volume | 17 |
Issue number | 12 |
State | Published - Dec 7 2018 |
Bibliographical note
Funding Information: We acknowledge funding for this work from the Informatics Technologies for Cancer Research (ITCR) program at the NIH/NCI, from grant U24CA204817 to R.K. and grant U24CA199347 to T.G. We also acknowledge support from the Center for Mass Spectrometry and Proteomics and the University of Minnesota Genome Center for assistance generating demonstration proteogenomic data. We also acknowledge use of the Jetstream cloud-based computing resource for scientific computing (https://jetstream-cloud.org/) maintained at Indiana University for assistance in maintaining the publicly available Galaxy instance used for demonstration purposes.
Publisher Copyright:
Copyright © 2018 American Chemical Society.
Keywords
- Biology/Disease-driven Human Proteome Project
- CRAVAT
- Chromosome-centric Human Proteome Project
- Galaxy-P
- bioinformatics
- cancer
- multiomics
- proteogenomics