First passage risk probability optimality for continuous time Markov decision processes
Kybernetika (2019)
- Volume: 55, Issue: 1, pages 114-133
- ISSN: 0023-5954
How to cite
Huo, Haifeng, and Wen, Xian. "First passage risk probability optimality for continuous time Markov decision processes." Kybernetika 55.1 (2019): 114-133. <http://eudml.org/doc/294508>.
@article{Huo2019,
abstract = {In this paper, we study continuous time Markov decision processes (CTMDPs) with a denumerable state space, a Borel action space, unbounded transition rates and a nonnegative reward function. The optimality criterion considered is the first passage risk probability criterion. To ensure the non-explosion of the state processes, we first introduce a so-called drift condition, which is weaker than the well-known regular condition for semi-Markov decision processes (SMDPs). Furthermore, under suitable conditions, we use a value iteration (recursive approximation) technique to establish the optimality equation and obtain the uniqueness of the value function and the existence of optimal policies. Finally, two examples illustrate our results.},
author = {Huo, Haifeng and Wen, Xian},
journal = {Kybernetika},
keywords = {continuous time Markov decision processes; first passage time; risk probability criterion; optimal policy},
language = {eng},
number = {1},
pages = {114--133},
publisher = {Institute of Information Theory and Automation AS CR},
title = {First passage risk probability optimality for continuous time Markov decision processes},
url = {http://eudml.org/doc/294508},
volume = {55},
year = {2019},
}
Citations in EuDML Documents
- Haifeng Huo, Xian Wen, Risk probability optimization problem for finite horizon continuous time Markov decision processes with loss rate
- Haifeng Huo, Jinhua Cui, Xian Wen, Minimizing risk probability for infinite discounted piecewise deterministic Markov decision processes
- Haifeng Huo, Xian Wen, The exponential cost optimality for finite horizon semi-Markov decision processes