sJIVE: Supervised joint and individual variation explained

Elise F. Palzer, Christine H. Wendt, Russell P. Bowler, Craig P. Hersh, Sandra E. Safo, Eric F. Lock

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Analyzing multi-source data, which are multiple views of data on the same subjects, has become increasingly common in molecular biomedical research. Recent methods have sought to uncover underlying structure and relationships within and/or between the data sources, and other methods have sought to build a predictive model for an outcome using all sources. However, existing methods that do both are limited because they either (1) consider only data structure shared by all datasets while ignoring structures unique to each source, or (2) extract underlying structures first without considering the outcome. The proposed method, supervised joint and individual variation explained (sJIVE), can simultaneously (1) identify shared (joint) and source-specific (individual) underlying structure and (2) build a linear prediction model for an outcome using these structures. These two components are weighted to compromise between explaining variation in the multi-source data and in the outcome. Simulations show that sJIVE outperforms existing methods when large amounts of noise are present in the multi-source data. An application to data from the COPDGene study explores gene expression and proteomic patterns associated with lung function.
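
The weighting between the two components can be illustrated with a schematic objective. This is a sketch under assumed notation, not a formula quoted from the abstract: X_k denotes the k-th data source (features by subjects), y the outcome, S_J the joint scores shared across sources, S_k the individual scores for source k, U_k and W_k the corresponding loadings, θ_1 and θ_{2k} the linear regression coefficients, and η in [0, 1] a tuning weight.

```latex
% Assumed notation for illustration only; symbols are not defined in the abstract.
\min_{U_k,\, W_k,\, S_J,\, S_k,\, \theta_1,\, \theta_{2k}}
\;\; \eta \sum_{k=1}^{K} \left\| X_k - U_k S_J - W_k S_k \right\|_F^2
\;+\; (1 - \eta) \left\| y - \theta_1 S_J - \sum_{k=1}^{K} \theta_{2k} S_k \right\|_2^2
```

Under this assumed form, values of η near 1 prioritize explaining variation in the data sources, while values near 0 prioritize predicting the outcome.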

Original language: English (US)
Article number: 107547
Journal: Computational Statistics and Data Analysis
Volume: 175
State: Published - Nov 2022

Bibliographical note

Publisher Copyright:
© 2022 Elsevier B.V.

Keywords

  • Data integration
  • Dimension reduction
  • Genomic data
  • High-dimensional prediction
  • Multi-source data
  • Multi-view learning
