Predicting Kidney Transplant Recipient Cohorts’ 30-Day Rehospitalization Using Clinical Notes and Electronic Health Care Record Data

Michael Arenson; Julien Hogan; Liyan Xu; Raymond Lynch; Yi Ting Hana Lee; Jinho D. Choi; Jimeng Sun; Andrew Adams; Rachel E. Patzer

doi:10.1016/j.ekir.2022.12.006

Surgery

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Introduction: Rehospitalization after kidney transplant is costly to patients and health care systems and is associated with poor outcomes. Few prediction model studies have examined whether inclusion of clinical notes data from the electronic medical record (EMR) enhances prediction of rehospitalization.

Methods: In a retrospective, observational study of first-time, adult kidney transplant recipients at a large, urban hospital in southeastern United States (2005–2015), we examined 30-day rehospitalization (30DR) using structured EMR and unstructured (i.e., clinical notes) data. We used natural language processing (NLP) methods on 8 types of clinical notes and included terms in predictive models using unsupervised machine learning approaches. Both the area under the receiver operating curve and precision-recall curve (ROC and PRC, respectively) were used to determine and compare model accuracy, and 5-fold cross-validation tested model performance.

Results: Among 2060 kidney transplant recipients, 30.7% were readmitted within 30 days. Predictive models using clinical notes did not meaningfully improve performance over previous models using structured data alone (ROC 0.6821; 95% confidence interval [CI]: 0.6644, 0.6998). Predictive models built using solely clinical notes performed worse than models using both clinical notes and structured data. The data that contributed to the top performing models were not identical but both included structured data and progress notes (ROC 0.6902; 95% CI: 0.6699, 0.7105).

Conclusions: Including new features from clinical notes in risk prediction models did not substantially increase predictive accuracy for 30DR for kidney transplant recipients. Future research should consider pooling data from multiple institutions to increase sample size and avoid overfitting models.
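The two accuracy measures named in the Methods (area under the ROC curve and area under the precision-recall curve) can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the function names `roc_auc` and `average_precision` and the five-patient toy data are hypothetical, and average precision is used here as one common step-wise estimate of the area under the precision-recall curve.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve:
    the mean of the precision values measured at each true positive when
    cases are ranked by descending score. Tied scores keep input order."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits


if __name__ == "__main__":
    # Hypothetical toy data (NOT from the study): 1 = readmitted within 30 days.
    toy_labels = [1, 0, 1, 0, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2]
    print(f"ROC AUC: {roc_auc(toy_labels, toy_scores):.4f}")            # 0.8333
    print(f"Average precision: {average_precision(toy_labels, toy_scores):.4f}")  # 0.8333
```

In a 5-fold cross-validation such as the one described above, metrics like these would typically be computed on each held-out fold and then summarized across folds.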

Original language	English (US)
Pages (from-to)	489-498
Number of pages	10
Journal	Kidney International Reports
Volume	8
Issue number	3
DOIs	https://doi.org/10.1016/j.ekir.2022.12.006
State	Published - Mar 2023

Bibliographical note

Funding Information:
We would like to acknowledge Bonggun Shin, PhD, for his analysis during early stages of this project. This research was funded by the National Institute on Minority Health and Health Disparities (R01MD011682) and was also supported by the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Numbers UL1TR002378 and TL1TR002382.

Publisher Copyright:
© 2022

Keywords

early readmission
kidney transplantation
machine learning
natural language processing
predicting readmission
risk prediction

PubMed: MeSH publication types

Journal Article

Cite this

@article{456179085d0442b19cd30fe0c0c6ac00,

title = "Predicting Kidney Transplant Recipient Cohorts{\textquoteright} 30-Day Rehospitalization Using Clinical Notes and Electronic Health Care Record Data",

abstract = "Introduction: Rehospitalization after kidney transplant is costly to patients and health care systems and is associated with poor outcomes. Few prediction model studies have examined whether inclusion of clinical notes data from the electronic medical record (EMR) enhances prediction of rehospitalization. Methods: In a retrospective, observational study of first-time, adult kidney transplant recipients at a large, urban hospital in southeastern United States (2005−2015), we examined 30-day rehospitalization (30DR) using structured EMR and unstructured (i.e., clinical notes) data. We used natural language processing (NLP) methods on 8 types of clinical notes and included terms in predictive models using unsupervised machine learning approaches. Both the area under the receiver operating curve and precision-recall curve (ROC and PRC, respectively) were used to determine and compare model accuracy, and 5-fold cross-validation tested model performance. Results: Among 2060 kidney transplant recipients, 30.7% were readmitted within 30 days. Predictive models using clinical notes did not meaningfully improve performance over previous models using structured data alone (ROC 0.6821; 95% confidence interval [CI]: 0.6644, 0.6998). Predictive models built using solely clinical notes performed worse than models using both clinical notes and structured data. The data that contributed to the top performing models were not identical but both included structured data and progress notes (ROC 0.6902; 95% CI: 0.6699, 0.7105). Conclusions: Including new features from clinical notes in risk prediction models did not substantially increase predictive accuracy for 30DR for kidney transplant recipients. Future research should consider pooling data from multiple institutions to increase sample size and avoid overfitting models.",

keywords = "early readmission, kidney transplantation, machine learning, natural language processing, predicting readmission, risk prediction",

author = "Arenson, Michael and Hogan, Julien and Xu, Liyan and Lynch, Raymond and Lee, {Yi Ting Hana} and Choi, {Jinho D.} and Sun, Jimeng and Adams, Andrew and Patzer, {Rachel E.}",

note = "Funding Information: We would like to acknowledge Bonggun Shin, PhD, for his analysis during early stages of this project. This research was funded by the National Institute on Minority Health and Health Disparities (R01MD011682) and was also supported by the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Numbers UL1TR002378 and TL1TR002382. Publisher Copyright: {\textcopyright} 2022",

year = "2023",

month = mar,

doi = "10.1016/j.ekir.2022.12.006",

language = "English (US)",

volume = "8",

pages = "489--498",

journal = "Kidney International Reports",

issn = "2468-0249",

publisher = "Elsevier Inc.",

number = "3",

}

TY - JOUR

T1 - Predicting Kidney Transplant Recipient Cohorts’ 30-Day Rehospitalization Using Clinical Notes and Electronic Health Care Record Data

AU - Arenson, Michael

AU - Hogan, Julien

AU - Xu, Liyan

AU - Lynch, Raymond

AU - Lee, Yi Ting Hana

AU - Choi, Jinho D.

AU - Sun, Jimeng

AU - Adams, Andrew

AU - Patzer, Rachel E.

N1 - Funding Information: We would like to acknowledge Bonggun Shin, PhD, for his analysis during early stages of this project. This research was funded by the National Institute on Minority Health and Health Disparities (R01MD011682) and was also supported by the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Numbers UL1TR002378 and TL1TR002382. Publisher Copyright: © 2022

PY - 2023/3

Y1 - 2023/3

AB - Introduction: Rehospitalization after kidney transplant is costly to patients and health care systems and is associated with poor outcomes. Few prediction model studies have examined whether inclusion of clinical notes data from the electronic medical record (EMR) enhances prediction of rehospitalization. Methods: In a retrospective, observational study of first-time, adult kidney transplant recipients at a large, urban hospital in southeastern United States (2005−2015), we examined 30-day rehospitalization (30DR) using structured EMR and unstructured (i.e., clinical notes) data. We used natural language processing (NLP) methods on 8 types of clinical notes and included terms in predictive models using unsupervised machine learning approaches. Both the area under the receiver operating curve and precision-recall curve (ROC and PRC, respectively) were used to determine and compare model accuracy, and 5-fold cross-validation tested model performance. Results: Among 2060 kidney transplant recipients, 30.7% were readmitted within 30 days. Predictive models using clinical notes did not meaningfully improve performance over previous models using structured data alone (ROC 0.6821; 95% confidence interval [CI]: 0.6644, 0.6998). Predictive models built using solely clinical notes performed worse than models using both clinical notes and structured data. The data that contributed to the top performing models were not identical but both included structured data and progress notes (ROC 0.6902; 95% CI: 0.6699, 0.7105). Conclusions: Including new features from clinical notes in risk prediction models did not substantially increase predictive accuracy for 30DR for kidney transplant recipients. Future research should consider pooling data from multiple institutions to increase sample size and avoid overfitting models.

KW - early readmission

KW - kidney transplantation

KW - machine learning

KW - natural language processing

KW - predicting readmission

KW - risk prediction

UR - http://www.scopus.com/inward/record.url?scp=85146463948&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85146463948&partnerID=8YFLogxK

U2 - 10.1016/j.ekir.2022.12.006

DO - 10.1016/j.ekir.2022.12.006

M3 - Article

C2 - 36938078

AN - SCOPUS:85146463948

SN - 2468-0249

VL - 8

SP - 489

EP - 498

JO - Kidney International Reports

JF - Kidney International Reports

IS - 3

ER -
