TY - JOUR
T1 - Comparison between inverse-probability weighting and multiple imputation in Cox model with missing failure subtype
AU - Guo, Fuyu
AU - Langworthy, Benjamin
AU - Ogino, Shuji
AU - Wang, Molin
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024
Y1 - 2024
N2 - Identifying and distinguishing risk factors for heterogeneous disease subtypes has been of great interest. However, missingness in disease subtypes is a common problem in such analyses. Several methods have been proposed to deal with the missing data, including complete-case analysis, inverse-probability weighting, and multiple imputation. Although the extant literature has compared these methods in missing-data problems, none has focused on the competing risk setting. In this paper, we discuss the assumptions required when complete-case analysis, inverse-probability weighting, and multiple imputation are used to deal with the missing failure subtype problem, focusing on how to implement these methods under various realistic scenarios in competing risk settings. In addition, we compare these three methods regarding their biases, efficiency, and robustness to model misspecification using simulation studies. Our results show that complete-case analysis can be seriously biased when the missing completely at random assumption does not hold. Inverse-probability weighting and multiple imputation estimators are valid when we correctly specify the corresponding models for missingness and for imputation, and multiple imputation typically shows higher efficiency than inverse-probability weighting. However, in real-world studies, building imputation models for the missing subtypes can be more challenging than building missingness models. In that case, inverse-probability weighting could be preferred for its ease of use. We also propose two automated model selection procedures and demonstrate their usage in a study of the association between smoking and colorectal cancer subtypes in the Nurses’ Health Study and Health Professionals Follow-up Study.
KW - Competing risk
KW - complete-case analysis
KW - inverse-probability weighting
KW - missing disease subtype
KW - multiple imputation
UR - http://www.scopus.com/inward/record.url?scp=85183038166&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85183038166&partnerID=8YFLogxK
U2 - 10.1177/09622802231226328
DO - 10.1177/09622802231226328
M3 - Article
C2 - 38262434
AN - SCOPUS:85183038166
SN - 0962-2802
JO - Statistical Methods in Medical Research
JF - Statistical Methods in Medical Research
ER -