Minimizing risk probability for infinite discounted piecewise deterministic Markov decision processes
Haifeng Huo; Jinhua Cui; Xian Wen
Kybernetika (2024)
- Issue: 3, pages 357-378
- ISSN: 0023-5954
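Abstract
The purpose of this paper is to study the risk probability problem for infinite horizon piecewise deterministic Markov decision processes (PDMDPs) with varying discount factors and unbounded transition rates. In contrast to the usual expected total reward criterion, we aim to minimize the risk probability that the total rewards do not exceed a given target value. Under a non-explosiveness condition on the controlled state process that is slightly weaker than the corresponding ones in the previous literature, we prove the existence and uniqueness of a solution to the optimality equation, and the existence of a risk probability optimal policy by using the value iteration algorithm. Finally, we provide two examples to illustrate our results: one explains and verifies our conditions, and the other shows the computational results for the value function and the risk probability optimal policy.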
How to cite
Huo, Haifeng, Cui, Jinhua, and Wen, Xian. "Minimizing risk probability for infinite discounted piecewise deterministic Markov decision processes." Kybernetika (2024): 357-378. <http://eudml.org/doc/299292>.
@article{Huo2024,
abstract = {The purpose of this paper is to study the risk probability problem for infinite horizon piecewise deterministic Markov decision processes (PDMDPs) with varying discount factors and unbounded transition rates. In contrast to the usual expected total reward criterion, we aim to minimize the risk probability that the total rewards do not exceed a given target value. Under a non-explosiveness condition on the controlled state process that is slightly weaker than the corresponding ones in the previous literature, we prove the existence and uniqueness of a solution to the optimality equation, and the existence of a risk probability optimal policy by using the value iteration algorithm. Finally, we provide two examples to illustrate our results: one explains and verifies our conditions, and the other shows the computational results for the value function and the risk probability optimal policy.},
author = {Huo, Haifeng and Cui, Jinhua and Wen, Xian},
journal = {Kybernetika},
keywords = {piecewise deterministic Markov decision processes; risk probability criterion; optimal policy; the value iteration algorithm},
language = {eng},
number = {3},
pages = {357-378},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Minimizing risk probability for infinite discounted piecewise deterministic Markov decision processes},
url = {http://eudml.org/doc/299292},
year = {2024},
}
TY - JOUR
AU - Huo, Haifeng
AU - Cui, Jinhua
AU - Wen, Xian
TI - Minimizing risk probability for infinite discounted piecewise deterministic Markov decision processes
JO - Kybernetika
PY - 2024
PB - Institute of Information Theory and Automation AS CR
IS - 3
SP - 357
EP - 378
AB - The purpose of this paper is to study the risk probability problem for infinite horizon piecewise deterministic Markov decision processes (PDMDPs) with varying discount factors and unbounded transition rates. In contrast to the usual expected total reward criterion, we aim to minimize the risk probability that the total rewards do not exceed a given target value. Under a non-explosiveness condition on the controlled state process that is slightly weaker than the corresponding ones in the previous literature, we prove the existence and uniqueness of a solution to the optimality equation, and the existence of a risk probability optimal policy by using the value iteration algorithm. Finally, we provide two examples to illustrate our results: one explains and verifies our conditions, and the other shows the computational results for the value function and the risk probability optimal policy.
LA - eng
KW - piecewise deterministic Markov decision processes; risk probability criterion; optimal policy; the value iteration algorithm
UR - http://eudml.org/doc/299292
ER -