Minimizing risk probability for infinite discounted piecewise deterministic Markov decision processes

Haifeng Huo; Jinhua Cui; Xian Wen

Kybernetika (2024)

  • Issue: 3, page 357-378
  • ISSN: 0023-5954

Abstract

This paper studies the risk probability problem for infinite horizon piecewise deterministic Markov decision processes (PDMDPs) with varying discount factors and unbounded transition rates. In contrast to the usual expected total reward criterion, we aim to minimize the risk probability that the total rewards do not exceed a given target value. Under a non-explosiveness condition on the controlled state process that is slightly weaker than the corresponding conditions in the previous literature, we prove the existence and uniqueness of a solution to the optimality equation and, using the value iteration algorithm, the existence of a risk probability optimal policy. Finally, we provide two examples to illustrate our results: the first explains and verifies our conditions, and the second shows computational results for the value function and the risk probability optimal policy.
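To make the risk probability criterion concrete, the following is a minimal sketch of target-augmented value iteration on a toy model. It is not the paper's algorithm: it assumes a discrete-time, finite MDP with a constant discount factor and a finite number of iterations, whereas the paper treats continuous-time PDMDPs with varying discount factors. The names `MODEL`, `BETA`, `risk_probability`, and `greedy_action` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from functools import lru_cache

# Assumption: a constant discount factor (the paper allows varying ones).
BETA = Fraction(1, 2)

# Hypothetical toy model: MODEL[state][action] is a list of
# (probability, next_state, reward) outcomes.
MODEL = {
    0: {
        "safe": [(Fraction(1), 0, Fraction(1))],
        "risky": [(Fraction(1, 2), 0, Fraction(0)),
                  (Fraction(1, 2), 0, Fraction(3))],
    }
}


def action_value(n, state, target, action):
    """Risk probability of taking `action` now and acting optimally afterwards."""
    # After collecting reward r, the remaining discounted reward must stay
    # at or below (target - r) / BETA for the total to stay at or below target.
    return sum(
        p * risk_probability(n - 1, nxt, (target - r) / BETA)
        for p, nxt, r in MODEL[state][action]
    )


@lru_cache(maxsize=None)
def risk_probability(n, state, target):
    """Minimal probability that the n-step discounted reward is <= target."""
    if n == 0:
        # With no steps left the accumulated reward is 0.
        return Fraction(1) if target >= 0 else Fraction(0)
    return min(action_value(n, state, target, a) for a in MODEL[state])


def greedy_action(n, state, target):
    """An action attaining the minimum in the n-th value iteration step."""
    return min(MODEL[state], key=lambda a: action_value(n, state, target, a))
```

In this toy model the minimizing action depends on the target level as well as on the state: with one step left, `greedy_action(1, 0, Fraction(2))` selects `"risky"`, while `greedy_action(1, 0, Fraction(1, 2))` selects `"safe"`. This is why the value function in risk probability problems is defined on state and target pairs.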

How to cite


Huo, Haifeng, Cui, Jinhua, and Wen, Xian. "Minimizing risk probability for infinite discounted piecewise deterministic Markov decision processes." Kybernetika (2024): 357-378. <http://eudml.org/doc/299292>.

@article{Huo2024,
abstract = {This paper studies the risk probability problem for infinite horizon piecewise deterministic Markov decision processes (PDMDPs) with varying discount factors and unbounded transition rates. In contrast to the usual expected total reward criterion, we aim to minimize the risk probability that the total rewards do not exceed a given target value. Under a non-explosiveness condition on the controlled state process that is slightly weaker than the corresponding conditions in the previous literature, we prove the existence and uniqueness of a solution to the optimality equation and, using the value iteration algorithm, the existence of a risk probability optimal policy. Finally, we provide two examples to illustrate our results: the first explains and verifies our conditions, and the second shows computational results for the value function and the risk probability optimal policy.},
author = {Huo, Haifeng and Cui, Jinhua and Wen, Xian},
journal = {Kybernetika},
keywords = {piecewise deterministic Markov decision processes; risk probability criterion; optimal policy; the value iteration algorithm},
language = {eng},
number = {3},
pages = {357-378},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Minimizing risk probability for infinite discounted piecewise deterministic Markov decision processes},
url = {http://eudml.org/doc/299292},
year = {2024},
}

TY - JOUR
AU - Huo, Haifeng
AU - Cui, Jinhua
AU - Wen, Xian
TI - Minimizing risk probability for infinite discounted piecewise deterministic Markov decision processes
JO - Kybernetika
PY - 2024
PB - Institute of Information Theory and Automation AS CR
IS - 3
SP - 357
EP - 378
AB - This paper studies the risk probability problem for infinite horizon piecewise deterministic Markov decision processes (PDMDPs) with varying discount factors and unbounded transition rates. In contrast to the usual expected total reward criterion, we aim to minimize the risk probability that the total rewards do not exceed a given target value. Under a non-explosiveness condition on the controlled state process that is slightly weaker than the corresponding conditions in the previous literature, we prove the existence and uniqueness of a solution to the optimality equation and, using the value iteration algorithm, the existence of a risk probability optimal policy. Finally, we provide two examples to illustrate our results: the first explains and verifies our conditions, and the second shows computational results for the value function and the risk probability optimal policy.
LA - eng
KW - piecewise deterministic Markov decision processes; risk probability criterion; optimal policy; the value iteration algorithm
UR - http://eudml.org/doc/299292
ER -

