Another set of verifiable conditions for average Markov decision processes with Borel spaces
Kybernetika (2015)
- Volume: 51, Issue: 2, page 276-292
- ISSN: 0023-5954
Abstract
In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions, which consists only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the Markov decision process model and are thus easy to verify. We also give two examples for which all our conditions are satisfied, but some of the conditions in the related literature fail to hold.
How to cite
topZou, Xiaolong, and Guo, Xianping. "Another set of verifiable conditions for average Markov decision processes with Borel spaces." Kybernetika 51.2 (2015): 276-292. <http://eudml.org/doc/270133>.
@article{Zou2015,
abstract = {In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions, which consists only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the Markov decision process model and are thus easy to verify. We also give two examples for which all our conditions are satisfied, but some of the conditions in the related literature fail to hold.},
author = {Zou, Xiaolong, Guo, Xianping},
journal = {Kybernetika},
keywords = {discrete-time Markov decision processes; average reward criterion; optimal stationary policy; Lyapunov-type condition; unbounded reward/cost function},
language = {eng},
number = {2},
pages = {276-292},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Another set of verifiable conditions for average Markov decision processes with Borel spaces},
url = {http://eudml.org/doc/270133},
volume = {51},
year = {2015},
}
TY - JOUR
AU - Zou, Xiaolong
AU - Guo, Xianping
TI - Another set of verifiable conditions for average Markov decision processes with Borel spaces
JO - Kybernetika
PY - 2015
PB - Institute of Information Theory and Automation AS CR
VL - 51
IS - 2
SP - 276
EP - 292
AB - In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions, which consists only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the Markov decision process model and are thus easy to verify. We also give two examples for which all our conditions are satisfied, but some of the conditions in the related literature fail to hold.
LA - eng
KW - discrete-time Markov decision processes; average reward criterion; optimal stationary policy; Lyapunov-type condition; unbounded reward/cost function
UR - http://eudml.org/doc/270133
ER -
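The Lyapunov-type condition referred to in the abstract is not reproduced on this record page. As a rough illustration only, drift conditions of this kind in the weighted-norm literature on average-reward Markov decision processes (see, e.g., the Gordienko and Hernández-Lerma reference below) are typically stated as follows; the transition kernel Q, reward function r, weight function w ≥ 1, constants ρ ∈ (0,1), b ≥ 0, M ≥ 0, and set C used here are generic placeholders, not necessarily the exact hypotheses of Zou and Guo:

\int_X w(y)\, Q(dy \mid x, a) \;\le\; \rho\, w(x) + b\, \mathbf{1}_C(x),
\qquad |r(x, a)| \;\le\; M\, w(x) \quad \text{for all admissible pairs } (x, a).

Combined with the usual continuity-compactness conditions on the model, inequalities of this type are typically used to establish the average-reward optimality inequality and hence the existence of an average optimal stationary policy.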
References
- Arapostathis, A., et al., 10.1137/0331018, SIAM J. Control Optim. 31 (1993), 282-344. MR1205981DOI10.1137/0331018
- Casella, G., Berger, R. L., Statistical Inference. Second edition., Duxbury Thomson Learning 2002.
- Dynkin, E. B., Yushkevich, A. A., Controlled Markov Processes., Springer, New York 1979. MR0554083
- Gordienko, E., Hernández-Lerma, O., Average cost Markov control processes with weighted norms: existence of canonical policies., Appl. Math. (Warsaw) 23 (1995), 2, 199-218. Zbl0829.93067MR1341223
- Guo, X. P., Shi, P., 10.1137/s1052623499355235, SIAM J. Optim. 11 (2001), 4, 1037-1053. Zbl1010.90092MR1855220DOI10.1137/s1052623499355235
- Guo, X. P., Zhu, Q. X., 10.1239/jap/1152413725, J. Appl. Probab. 43 (2006), 318-334. Zbl1121.90122MR2248567DOI10.1239/jap/1152413725
- Hernández-Lerma, O., Lasserre, J. B., 10.1007/978-1-4612-0729-0, Springer, New York 1996. Zbl0928.93002MR1363487DOI10.1007/978-1-4612-0729-0
- Hernández-Lerma, O., Lasserre, J. B., 10.1007/978-1-4612-0561-6, Springer, New York 1999. Zbl0928.93002MR1697198DOI10.1007/978-1-4612-0561-6
- Kakumanu, M., 10.1137/0310016, SIAM J. Control Optim. 10 (1972), 1, 210-220. MR0307785DOI10.1137/0310016
- Lund, R. B., Tweedie, R. L., 10.1287/moor.21.1.182, Math. Oper. Res. 21 (1996), 1, 182-194. Zbl0847.60053MR1385873DOI10.1287/moor.21.1.182
- Meyn, S. P., Tweedie, R. L., 10.1017/cbo9780511626630, Cambridge Univ. Press, New York 2009. Zbl1165.60001MR2509253DOI10.1017/cbo9780511626630
- Puterman, M. L., 10.1002/9780470316887, John Wiley, New York 1994. Zbl1184.90170MR1270015DOI10.1002/9780470316887
- Sennott, L. I., 10.1007/978-1-4615-0805-2_5, In: Handbook of Markov Decision Processes (Int. Ser. Operat. Res. Manag. Sci. 40) (E. A. Feinberg and A. Shwartz, eds.), Kluwer, Boston 2002, pp. 153-172. Zbl1008.90068MR1887202DOI10.1007/978-1-4615-0805-2_5
- Sennott, L. I., 10.1002/9780470317037, Wiley, New York 1999. Zbl0997.93503MR1645435DOI10.1002/9780470317037
- Zhu, Q. X., 10.1016/j.jmaa.2007.06.071, J. Math. Anal. Appl. 339 (2008), 1, 691-704. MR2370686DOI10.1016/j.jmaa.2007.06.071