Another set of verifiable conditions for average Markov decision processes with Borel spaces
Kybernetika (2015)
- Volume: 51, Issue: 2, page 276-292
- ISSN: 0023-5954
Abstract
In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions, which consists only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the Markov decision process model and are thus easy to verify. We also give two examples for which all our conditions are satisfied, but some of the conditions in the related literature fail to hold.
How to cite
topZou, Xiaolong, and Guo, Xianping. "Another set of verifiable conditions for average Markov decision processes with Borel spaces." Kybernetika 51.2 (2015): 276-292. <http://eudml.org/doc/270133>.
@article{Zou2015,
abstract = {In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions, which consists only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the Markov decision process model and are thus easy to verify. We also give two examples for which all our conditions are satisfied, but some of the conditions in the related literature fail to hold.},
author = {Zou, Xiaolong, Guo, Xianping},
journal = {Kybernetika},
keywords = {discrete-time Markov decision processes; average reward criterion; optimal stationary policy; Lyapunov-type condition; unbounded reward/cost function},
language = {eng},
number = {2},
pages = {276-292},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Another set of verifiable conditions for average Markov decision processes with Borel spaces},
url = {http://eudml.org/doc/270133},
volume = {51},
year = {2015},
}
TY - JOUR
AU - Zou, Xiaolong
AU - Guo, Xianping
TI - Another set of verifiable conditions for average Markov decision processes with Borel spaces
JO - Kybernetika
PY - 2015
PB - Institute of Information Theory and Automation AS CR
VL - 51
IS - 2
SP - 276
EP - 292
AB - In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions, which consists only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the Markov decision process model and are thus easy to verify. We also give two examples for which all our conditions are satisfied, but some of the conditions in the related literature fail to hold.
LA - eng
KW - discrete-time Markov decision processes; average reward criterion; optimal stationary policy; Lyapunov-type condition; unbounded reward/cost function
UR - http://eudml.org/doc/270133
ER -
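The Lyapunov-type condition referred to in the abstract is not reproduced on this record page. As a rough illustration only, drift conditions of this kind in the weighted-norm literature on average-reward Markov decision processes (see, e.g., the Gordienko and Hernández-Lerma reference below) are typically stated as follows; the transition kernel Q, reward function r, weight function w ≥ 1, constants ρ ∈ (0,1), b ≥ 0, M ≥ 0, and set C used here are generic placeholders, not necessarily the exact hypotheses of Zou and Guo:

\int_X w(y)\, Q(dy \mid x, a) \;\le\; \rho\, w(x) + b\, \mathbf{1}_C(x),
\qquad |r(x, a)| \;\le\; M\, w(x) \quad \text{for all admissible pairs } (x, a).

Combined with the usual continuity-compactness conditions on the model, inequalities of this type are typically used to establish the average-reward optimality inequality and hence the existence of an average optimal stationary policy.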
References
- Arapostathis, A., et al., 10.1137/0331018, SIAM J. Control Optim. 31 (1993), 282-344. MR1205981DOI10.1137/0331018
- Casella, G., Berger, R. L., Statistical Inference. Second edition., Duxbury Thomson Learning 2002.
- Dynkin, E. B., Yushkevich, A. A., Controlled Markov Processes., Springer, New York 1979. MR0554083
- Gordienko, E., Hernández-Lerma, O., Average cost Markov control processes with weighted norms: existence of canonical policies., Appl. Math. (Warsaw) 23 (1995), 2, 199-218. Zbl0829.93067MR1341223
- Guo, X. P., Shi, P., 10.1137/s1052623499355235, SIAM J. Optim. 11 (2001), 4, 1037-1053. Zbl1010.90092MR1855220DOI10.1137/s1052623499355235
- Guo, X. P., Zhu, Q. X., 10.1239/jap/1152413725, J. Appl. Probab. 43 (2006), 318-334. Zbl1121.90122MR2248567DOI10.1239/jap/1152413725
- Hernández-Lerma, O., Lasserre, J. B., 10.1007/978-1-4612-0729-0, Springer, New York 1996. Zbl0928.93002MR1363487DOI10.1007/978-1-4612-0729-0
- Hernández-Lerma, O., Lasserre, J. B., 10.1007/978-1-4612-0561-6, Springer, New York 1999. Zbl0928.93002MR1697198DOI10.1007/978-1-4612-0561-6
- Kakumanu, M., 10.1137/0310016, SIAM J. Control Optim. 10 (1972), 1, 210-220. MR0307785DOI10.1137/0310016
- Lund, R. B., Tweedie, R. L., 10.1287/moor.21.1.182, Math. Oper. Res. 21 (1996), 1, 182-194. Zbl0847.60053MR1385873DOI10.1287/moor.21.1.182
- Meyn, S. P., Tweedie, R. L., 10.1017/cbo9780511626630, Cambridge Univ. Press, New York 2009. Zbl1165.60001MR2509253DOI10.1017/cbo9780511626630
- Puterman, M. L., 10.1002/9780470316887, John Wiley, New York 1994. Zbl1184.90170MR1270015DOI10.1002/9780470316887
- Sennott, L. I., 10.1007/978-1-4615-0805-2_5, In: Handbook of Markov Decision Processes (Int. Ser. Operat. Res. Manag. Sci. 40) (E. A. Feinberg and A. Shwartz, eds.), Kluwer, Boston 2002, pp. 153-172. Zbl1008.90068MR1887202DOI10.1007/978-1-4615-0805-2_5
- Sennott, L. I., 10.1002/9780470317037, Wiley, New York 1999. Zbl0997.93503MR1645435DOI10.1002/9780470317037
- Zhu, Q. X., 10.1016/j.jmaa.2007.06.071, J. Math. Anal. Appl. 339 (2008), 1, 691-704. MR2370686DOI10.1016/j.jmaa.2007.06.071