Markov decision processes with time-varying discount factors and random horizon

Rocio Ilhuicatzi-Roldán; Hugo Cruz-Suárez; Selene Chávez-Rodríguez

Kybernetika (2017)

  • Volume: 53, Issue: 1, pages 82-98
  • ISSN: 0023-5954

Abstract

This paper studies Markov decision processes in which the optimal control problem is to minimize the expected total discounted cost under a non-constant discount factor. The discount factor is time-varying and may depend on the state and the action. In addition, the horizon of the optimization problem is a discrete random variable, that is, a random horizon is assumed. Under general conditions on the Markov control model, the dynamic programming approach yields an optimality equation in both cases considered: finite support and infinite support of the random horizon. The results are illustrated by two examples, one of them related to optimal replacement.
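
As a rough illustration of the dynamic programming approach in the finite-support case, the following Python sketch runs backward induction on a small finite model. It is not taken from the paper: it assumes the horizon τ is independent of the controlled process, that costs are incurred at stages 0, ..., τ, and that discounting accumulates multiplicatively through a state-action discount α(x, a). Under those assumptions the recursion weights the continuation value by P(τ ≥ t+1)/P(τ ≥ t). The function name `solve_random_horizon`, the two-state replacement model, and all numbers are hypothetical.

```python
from math import inf


def solve_random_horizon(states, actions, cost, discount, trans, horizon_pmf):
    """Backward induction for a finite MDP with a random horizon (sketch).

    horizon_pmf[k] = P(tau = k) for k = 0, ..., N (finite support, assumed
    independent of the process). cost, discount and trans are dicts keyed by
    (state, action); trans[(x, a)] maps next states to probabilities.
    Returns the stage-0 value function and one decision rule per stage.
    """
    N = len(horizon_pmf) - 1
    # survival[t] = P(tau >= t); survival[N + 1] = 0 by construction.
    survival = [sum(horizon_pmf[t:]) for t in range(N + 2)]

    value = {x: 0.0 for x in states}  # terminal condition V_{N+1} = 0
    policy = []
    for t in range(N, -1, -1):
        # Conditional probability of surviving past stage t.
        rho = survival[t + 1] / survival[t] if survival[t] > 0 else 0.0
        new_value, rule = {}, {}
        for x in states:
            best_a, best_q = None, inf
            for a in actions:
                cont = sum(p * value[y] for y, p in trans[(x, a)].items())
                q = cost[(x, a)] + rho * discount[(x, a)] * cont
                if q < best_q:
                    best_a, best_q = a, q
            new_value[x], rule[x] = best_q, best_a
        value = new_value
        policy.insert(0, rule)
    return value, policy


# Hypothetical two-state replacement model (all numbers are made up).
states = ["good", "worn"]
actions = ["keep", "replace"]
cost = {("good", "keep"): 1.0, ("good", "replace"): 5.0,
        ("worn", "keep"): 4.0, ("worn", "replace"): 5.0}
discount = {("good", "keep"): 0.9, ("good", "replace"): 0.9,
            ("worn", "keep"): 0.8, ("worn", "replace"): 0.9}
trans = {("good", "keep"): {"good": 0.7, "worn": 0.3},
         ("good", "replace"): {"good": 1.0},
         ("worn", "keep"): {"worn": 1.0},
         ("worn", "replace"): {"good": 1.0}}

# Horizon equal to 0 or 1 with probability 1/2 each.
V0, rules = solve_random_horizon(states, actions, cost, discount, trans,
                                 [0.5, 0.5])
print(V0, rules[0])
```

With a degenerate horizon, for example `horizon_pmf = [0.0, 1.0]`, the weight P(τ ≥ t+1)/P(τ ≥ t) equals 1 before the final stage, and the recursion reduces to ordinary finite-horizon backward induction with a state-action dependent discount.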

How to cite

Ilhuicatzi-Roldán, Rocio, Cruz-Suárez, Hugo, and Chávez-Rodríguez, Selene. "Markov decision processes with time-varying discount factors and random horizon." Kybernetika 53.1 (2017): 82-98. <http://eudml.org/doc/287959>.

@article{Ilhuicatzi2017,
abstract = {This paper is related to Markov Decision Processes. The optimal control problem is to minimize the expected total discounted cost, with a non-constant discount factor. The discount factor is time-varying and it could depend on the state and the action. Furthermore, it is considered that the horizon of the optimization problem is given by a discrete random variable, that is, a random horizon is assumed. Under general conditions on Markov control model, using the dynamic programming approach, an optimality equation for both cases is obtained, namely, finite support and infinite support of the random horizon. The obtained results are illustrated by two examples, one of them related to optimal replacement.},
author = {Ilhuicatzi-Roldán, Rocio and Cruz-Suárez, Hugo and Chávez-Rodríguez, Selene},
journal = {Kybernetika},
keywords = {Markov decision process; dynamic programming; varying discount factor; random horizon},
language = {eng},
number = {1},
pages = {82--98},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Markov decision processes with time-varying discount factors and random horizon},
url = {http://eudml.org/doc/287959},
volume = {53},
year = {2017},
}

TY - JOUR
AU - Ilhuicatzi-Roldán, Rocio
AU - Cruz-Suárez, Hugo
AU - Chávez-Rodríguez, Selene
TI - Markov decision processes with time-varying discount factors and random horizon
JO - Kybernetika
PY - 2017
PB - Institute of Information Theory and Automation AS CR
VL - 53
IS - 1
SP - 82
EP - 98
AB - This paper is related to Markov Decision Processes. The optimal control problem is to minimize the expected total discounted cost, with a non-constant discount factor. The discount factor is time-varying and it could depend on the state and the action. Furthermore, it is considered that the horizon of the optimization problem is given by a discrete random variable, that is, a random horizon is assumed. Under general conditions on Markov control model, using the dynamic programming approach, an optimality equation for both cases is obtained, namely, finite support and infinite support of the random horizon. The obtained results are illustrated by two examples, one of them related to optimal replacement.
LA - eng
KW - Markov decision process; dynamic programming; varying discount factor; random horizon
UR - http://eudml.org/doc/287959
ER -
