A Likelihood-Based Approach for Multivariate Categorical Response Regression in High Dimensions

Research output: Contribution to journal (Article, peer-reviewed)

2 Scopus citations

Abstract

We propose a penalized likelihood method to fit the bivariate categorical response regression model. Our method allows practitioners to estimate which predictors are irrelevant, which predictors only affect the marginal distributions of the bivariate response, and which predictors affect both the marginal distributions and log odds ratios. To compute our estimator, we propose an efficient algorithm which we extend to settings where some subjects have only one response variable measured, that is, a semi-supervised setting. We derive an asymptotic error bound which illustrates the performance of our estimator in high-dimensional settings. Generalizations to the multivariate categorical response regression model are proposed. Finally, simulation studies and an application in pan-cancer risk prediction demonstrate the usefulness of our method in terms of interpretability and prediction accuracy. Supplementary materials for this article are available online.
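The three predictor types named in the abstract can be illustrated with a small numerical sketch. It assumes a log-linear parameterization of the joint distribution, log P(Y1 = j, Y2 = k | x) = a_j'x + b_k'x + c_jk'x + const, which may differ from the parameterization used in the article. The function names and the arrays `a`, `b`, `c` are hypothetical. Under this assumption, a predictor whose interaction coefficients in `c` are all zero can shift the cell probabilities (and hence the marginals) but leaves every log odds ratio unchanged.

```python
import numpy as np

def joint_probs(x, a, b, c):
    """Joint pmf of (Y1, Y2) given x under an assumed log-linear model.

    a: (J, p) main effects for Y1, b: (K, p) main effects for Y2,
    c: (J, K, p) interaction effects, x: (p,) predictor vector.
    """
    eta = (a @ x)[:, None] + (b @ x)[None, :] + c @ x  # shape (J, K)
    eta = eta - eta.max()          # stabilize the exponentials
    w = np.exp(eta)
    return w / w.sum()             # normalize to a J x K probability table

def log_odds_ratio(P, j=0, k=0, j2=1, k2=1):
    """Log odds ratio of the 2 x 2 subtable formed by rows (j, j2) and columns (k, k2)."""
    return np.log(P[j, k]) + np.log(P[j2, k2]) - np.log(P[j, k2]) - np.log(P[j2, k])

# Illustrative coefficients (random, not estimates from any data).
rng = np.random.default_rng(0)
J, K, p = 2, 3, 3
a = rng.normal(size=(J, p))
b = rng.normal(size=(K, p))
c = rng.normal(size=(J, K, p))
c[:, :, 1] = 0.0                   # predictor 1 has no interaction effect

x_lo = np.array([1.0, -1.0, 0.5])  # differs from x_hi only in predictor 1
x_hi = np.array([1.0, 2.0, 0.5])
P_lo = joint_probs(x_lo, a, b, c)
P_hi = joint_probs(x_hi, a, b, c)

print("Y1 marginal at x_lo:", P_lo.sum(axis=1))
print("Y1 marginal at x_hi:", P_hi.sum(axis=1))
print("log odds ratios:", log_odds_ratio(P_lo), log_odds_ratio(P_hi))
```

In this sketch, a predictor whose coefficients are zero in `a`, `b`, and `c` would be irrelevant, one that is zero only in `c` would change the cell probabilities without changing the log odds ratios, and one that is nonzero in `c` would change both. A penalty that zeroes out these coefficient groups for each predictor is one way such a structure could be estimated; the abstract does not specify the penalty.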

Original language: English (US)
Pages (from-to): 1402-1414
Number of pages: 13
Journal: Journal of the American Statistical Association
Volume: 118
Issue number: 542
State: Published - 2023

Bibliographical note

Publisher Copyright:
© 2021 American Statistical Association.

Keywords

  • Categorical data analysis
  • Classification
  • Convex optimization
  • Multi-label classification
  • Multinomial logistic regression
