Dynamic learning of patient response types: An application to treating chronic diseases

Diana M. Negoescu; Kostas Bimpikis; Margaret L. Brandeau; Dan A. Iancu

doi:10.1287/mnsc.2017.2793

Industrial and Systems Engineering

Research output: Contribution to journal › Article › peer-review

27 Scopus citations

Abstract

Currently available medication for treating many chronic diseases is often effective only for a subgroup of patients, and biomarkers accurately assessing whether an individual belongs to this subgroup typically do not exist. In such settings, physicians learn about the effectiveness of a drug primarily through experimentation—i.e., by initiating treatment and monitoring the patient’s response. Precise guidelines for discontinuing treatment are often lacking or left entirely to the physician’s discretion. We introduce a framework for developing adaptive, personalized treatments for such chronic diseases. Our model is based on a continuous-time, multi-armed bandit setting where drug effectiveness is assessed by aggregating information from several channels: by continuously monitoring the state of the patient, but also by (not) observing the occurrence of particular infrequent health events, such as relapses or disease flare-ups. Recognizing that the timing and severity of such events provide critical information for treatment decisions is a key point of departure in our framework compared with typical (bandit) models used in healthcare. We show that the model can be analyzed in closed form for several settings of interest, resulting in optimal policies that are intuitive and may have practical appeal. We illustrate the effectiveness of the methodology by developing a set of efficient treatment policies for multiple sclerosis, which we then use to benchmark several existing treatment guidelines.

Original language	English (US)
Pages (from-to)	3469-3488
Number of pages	20
Journal	Management Science
Volume	64
Issue number	8
DOIs	https://doi.org/10.1287/mnsc.2017.2793
State	Published - Aug 2018

Bibliographical note

Publisher Copyright:
© 2017 INFORMS.

Keywords

Adaptive treatment
Continuous time
Dynamic programming
Multiarmed bandits
Optimal control
Stochastic model applications

Access

10.1287/mnsc.2017.2793

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6193506

Cite this

@article{ba96e30ea8264725b1b7e21d957f5643,

title = "Dynamic learning of patient response types: An application to treating chronic diseases",

abstract = "Currently available medication for treating many chronic diseases is often effective only for a subgroup of patients, and biomarkers accurately assessing whether an individual belongs to this subgroup typically do not exist. In such settings, physicians learn about the effectiveness of a drug primarily through experimentation—i.e., by initiating treatment and monitoring the patient{\textquoteright}s response. Precise guidelines for discontinuing treatment are often lacking or left entirely to the physician{\textquoteright}s discretion. We introduce a framework for developing adaptive, personalized treatments for such chronic diseases. Our model is based on a continuous-time, multi-armed bandit setting where drug effectiveness is assessed by aggregating information from several channels: by continuously monitoring the state of the patient, but also by (not) observing the occurrence of particular infrequent health events, such as relapses or disease flare-ups. Recognizing that the timing and severity of such events provide critical information for treatment decisions is a key point of departure in our framework compared with typical (bandit) models used in healthcare. We show that the model can be analyzed in closed form for several settings of interest, resulting in optimal policies that are intuitive and may have practical appeal. We illustrate the effectiveness of the methodology by developing a set of efficient treatment policies for multiple sclerosis, which we then use to benchmark several existing treatment guidelines.",

keywords = "Adaptive treatment, Continuous time, Dynamic programming, Multiarmed bandits, Optimal control, Stochastic model applications",

author = "Negoescu, {Diana M.} and Bimpikis, Kostas and Brandeau, {Margaret L.} and Iancu, {Dan A.}",

note = "Publisher Copyright: {\textcopyright} 2017 INFORMS.",

year = "2018",

month = aug,

doi = "10.1287/mnsc.2017.2793",

language = "English (US)",

volume = "64",

pages = "3469--3488",

journal = "Management Science",

issn = "0025-1909",

publisher = "INFORMS Inst. for Operations Res. and the Management Sciences",

number = "8",

}

TY - JOUR

T1 - Dynamic learning of patient response types

T2 - An application to treating chronic diseases

AU - Negoescu, Diana M.

AU - Bimpikis, Kostas

AU - Brandeau, Margaret L.

AU - Iancu, Dan A.

PY - 2018/8

Y1 - 2018/8

AB - Currently available medication for treating many chronic diseases is often effective only for a subgroup of patients, and biomarkers accurately assessing whether an individual belongs to this subgroup typically do not exist. In such settings, physicians learn about the effectiveness of a drug primarily through experimentation—i.e., by initiating treatment and monitoring the patient’s response. Precise guidelines for discontinuing treatment are often lacking or left entirely to the physician’s discretion. We introduce a framework for developing adaptive, personalized treatments for such chronic diseases. Our model is based on a continuous-time, multi-armed bandit setting where drug effectiveness is assessed by aggregating information from several channels: by continuously monitoring the state of the patient, but also by (not) observing the occurrence of particular infrequent health events, such as relapses or disease flare-ups. Recognizing that the timing and severity of such events provide critical information for treatment decisions is a key point of departure in our framework compared with typical (bandit) models used in healthcare. We show that the model can be analyzed in closed form for several settings of interest, resulting in optimal policies that are intuitive and may have practical appeal. We illustrate the effectiveness of the methodology by developing a set of efficient treatment policies for multiple sclerosis, which we then use to benchmark several existing treatment guidelines.

KW - Adaptive treatment

KW - Continuous time

KW - Dynamic programming

KW - Multiarmed bandits

KW - Optimal control

KW - Stochastic model applications

UR - http://www.scopus.com/inward/record.url?scp=85049394189&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85049394189&partnerID=8YFLogxK

U2 - 10.1287/mnsc.2017.2793

DO - 10.1287/mnsc.2017.2793

M3 - Article

C2 - 30344343

AN - SCOPUS:85049394189

SN - 0025-1909

VL - 64

SP - 3469

EP - 3488

JO - Management Science

JF - Management Science

IS - 8

ER -
