TY - JOUR
T1 - Negative binomial sums of random variables and discounted reward processes
AU - Cooper, William L.
PY - 1998/9
Y1 - 1998/9
N2 - Given a sequence of random variables (rewards), the Haviv-Puterman differential equation relates the expected infinite-horizon λ-discounted reward and the expected total reward up to a random time that is determined by an independent negative binomial random variable with parameters 2 and λ. This paper provides an interpretation of this proven, but previously unexplained, result. Furthermore, the interpretation is formalized into a new proof, which then yields new results for the general case where the rewards are accumulated up to a time determined by an independent negative binomial random variable with parameters k and λ.
KW - Markov decision processes
KW - Reward processes
KW - Sums of random variables
UR - http://www.scopus.com/inward/record.url?scp=0032258939&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0032258939&partnerID=8YFLogxK
U2 - 10.1017/S0021900200016247
DO - 10.1017/S0021900200016247
M3 - Article
AN - SCOPUS:0032258939
SN - 0021-9002
VL - 35
SP - 589
EP - 599
JO - Journal of Applied Probability
JF - Journal of Applied Probability
IS - 3
ER -