The exponential cost optimality for finite horizon semi-Markov decision processes

Haifeng Huo; Xian Wen

Kybernetika (2022)

Volume: 58, Issue: 3, page 301-319
ISSN: 0023-5954

Abstract

This paper considers an exponential cost optimality problem for finite horizon semi-Markov decision processes (SMDPs). The objective is to calculate an optimal policy with minimal exponential costs over the full set of policies in a finite horizon. First, under the standard regular and compact-continuity conditions, we establish the optimality equation, prove that the value function is the unique solution of the optimality equation and the existence of an optimal policy by using the minimum nonnegative solution approach. Second, we establish a new value iteration algorithm to calculate both the value function and the ε-optimal policy. Finally, we give a computable machine maintenance system to illustrate the convergence of the algorithm.

How to cite

MLA
BibTeX
RIS

Huo, Haifeng, and Wen, Xian. "The exponential cost optimality for finite horizon semi-Markov decision processes." Kybernetika 58.3 (2022): 301-319. <http://eudml.org/doc/298915>.

@article{Huo2022,
abstract = {This paper considers an exponential cost optimality problem for finite horizon semi-Markov decision processes (SMDPs). The objective is to calculate an optimal policy with minimal exponential costs over the full set of policies in a finite horizon. First, under the standard regular and compact-continuity conditions, we establish the optimality equation, prove that the value function is the unique solution of the optimality equation and the existence of an optimal policy by using the minimum nonnegative solution approach. Second, we establish a new value iteration algorithm to calculate both the value function and the $\epsilon $-optimal policy. Finally, we give a computable machine maintenance system to illustrate the convergence of the algorithm.},
author = {Huo, Haifeng and Wen, Xian},
journal = {Kybernetika},
keywords = {semi-Markov decision processes; exponential cost; finite horizon; optimality equation; optimal policy},
language = {eng},
number = {3},
pages = {301--319},
publisher = {Institute of Information Theory and Automation AS CR},
title = {The exponential cost optimality for finite horizon semi-Markov decision processes},
url = {http://eudml.org/doc/298915},
volume = {58},
year = {2022},
}

TY - JOUR
AU - Huo, Haifeng
AU - Wen, Xian
TI - The exponential cost optimality for finite horizon semi-Markov decision processes
JO - Kybernetika
PY - 2022
PB - Institute of Information Theory and Automation AS CR
VL - 58
IS - 3
SP - 301
EP - 319
AB - This paper considers an exponential cost optimality problem for finite horizon semi-Markov decision processes (SMDPs). The objective is to calculate an optimal policy with minimal exponential costs over the full set of policies in a finite horizon. First, under the standard regular and compact-continuity conditions, we establish the optimality equation, prove that the value function is the unique solution of the optimality equation and the existence of an optimal policy by using the minimum nonnegative solution approach. Second, we establish a new value iteration algorithm to calculate both the value function and the $\epsilon $-optimal policy. Finally, we give a computable machine maintenance system to illustrate the convergence of the algorithm.
LA - eng
KW - semi-Markov decision processes; exponential cost; finite horizon; optimality equation; optimal policy
UR - http://eudml.org/doc/298915
ER -
