Another set of verifiable conditions for average Markov decision processes with Borel spaces

Xiaolong Zou; Xianping Guo

Kybernetika (2015)

  • Volume: 51, Issue: 2, page 276-292
  • ISSN: 0023-5954

Abstract

In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions consisting only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the Markov decision process model and are thus easy to verify. We also give two examples for which all of our conditions are satisfied, but some of the conditions in the related literature fail to hold.
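The paper's precise assumptions appear in the full text; for orientation, Lyapunov-type conditions for average-cost MDPs typically take the following drift form. The sketch below is generic and illustrative only, with symbols ($w$, $\beta$, $b$, $C$, $Q$, $r$) not taken from the paper:

```latex
% A typical Lyapunov (drift) condition for average-cost MDPs:
% there exist a measurable weight function w \ge 1 on the state
% space X, constants b \ge 0 and \beta \in (0,1), and a Borel
% set C \subset X such that, for every state-action pair (x,a),
\[
  \int_X w(y)\, Q(\mathrm{d}y \mid x, a)
    \;\le\; \beta\, w(x) + b\, \mathbf{1}_C(x),
\]
% together with a growth bound |r(x,a)| \le M\, w(x) on the
% reward/cost function r, for some constant M > 0.
```

Conditions of this shape are stated directly on the transition law $Q$ and the reward/cost function $r$, which is what makes them checkable from the primitive data of the model.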

How to cite


Zou, Xiaolong, and Guo, Xianping. "Another set of verifiable conditions for average Markov decision processes with Borel spaces." Kybernetika 51.2 (2015): 276-292. <http://eudml.org/doc/270133>.

@article{Zou2015,
abstract = {In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions consisting only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the Markov decision process model and are thus easy to verify. We also give two examples for which all of our conditions are satisfied, but some of the conditions in the related literature fail to hold.},
author = {Zou, Xiaolong and Guo, Xianping},
journal = {Kybernetika},
keywords = {discrete-time Markov decision processes; average reward criterion; optimal stationary policy; Lyapunov-type condition; unbounded reward/cost function},
language = {eng},
number = {2},
pages = {276-292},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Another set of verifiable conditions for average Markov decision processes with Borel spaces},
url = {http://eudml.org/doc/270133},
volume = {51},
year = {2015},
}

TY - JOUR
AU - Zou, Xiaolong
AU - Guo, Xianping
TI - Another set of verifiable conditions for average Markov decision processes with Borel spaces
JO - Kybernetika
PY - 2015
PB - Institute of Information Theory and Automation AS CR
VL - 51
IS - 2
SP - 276
EP - 292
AB - In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions consisting only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the Markov decision process model and are thus easy to verify. We also give two examples for which all of our conditions are satisfied, but some of the conditions in the related literature fail to hold.
LA - eng
KW - discrete-time Markov decision processes; average reward criterion; optimal stationary policy; Lyapunov-type condition; unbounded reward/cost function
UR - http://eudml.org/doc/270133
ER -

References

  1. Arapostathis, A., et al., 10.1137/0331018, SIAM J. Control Optim. 31 (1993), 282-344. MR1205981DOI10.1137/0331018
  2. Casella, G., Berger, R. L., Statistical Inference. Second edition., Duxbury Thomson Learning 2002. 
  3. Dynkin, E. B., Yushkevich, A. A., Controlled Markov Processes., Springer, New York 1979. MR0554083
  4. Gordienko, E., Hernández-Lerma, O., Average cost Markov control processes with weighted norms: existence of canonical policies., Appl. Math. (Warsaw) 23 (1995), 2, 199-218. Zbl0829.93067MR1341223
  5. Guo, X. P., Shi, P., 10.1137/s1052623499355235, SIAM J. Optim. 11 (2001), 4, 1037-1053. Zbl1010.90092MR1855220DOI10.1137/s1052623499355235
  6. Guo, X. P., Zhu, Q. X., 10.1239/jap/1152413725, J. Appl. Probab. 43 (2006), 318-334. Zbl1121.90122MR2248567DOI10.1239/jap/1152413725
  7. Hernández-Lerma, O., Lasserre, J. B., 10.1007/978-1-4612-0729-0, Springer, New York 1996. Zbl0928.93002MR1363487DOI10.1007/978-1-4612-0729-0
  8. Hernández-Lerma, O., Lasserre, J. B., 10.1007/978-1-4612-0561-6, Springer, New York 1999. Zbl0928.93002MR1697198DOI10.1007/978-1-4612-0561-6
  9. Kakumanu, M., 10.1137/0310016, SIAM J. Control Optim. 10 (1972), 1, 210-220. MR0307785DOI10.1137/0310016
  10. Lund, R. B., Tweedie, R. L., 10.1287/moor.21.1.182, Math. Oper. Res. 21 (1996), 1, 182-194. Zbl0847.60053MR1385873DOI10.1287/moor.21.1.182
  11. Meyn, S. P., Tweedie, R. L., 10.1017/cbo9780511626630, Cambridge Univ. Press, New York 2009. Zbl1165.60001MR2509253DOI10.1017/cbo9780511626630
  12. Puterman, M. L., 10.1002/9780470316887, John Wiley, New York 1994. Zbl1184.90170MR1270015DOI10.1002/9780470316887
  13. Sennott, L. I., 10.1007/978-1-4615-0805-2_5, In: Handbook of Markov Decision Processes (Int. Ser. Operat. Res. Manag. Sci. 40) (E. A. Feinberg and A. Shwartz, eds.), Kluwer, Boston, pp. 153-172. Zbl1008.90068MR1887202DOI10.1007/978-1-4615-0805-2_5
  14. Sennott, L. I., 10.1002/9780470317037, Wiley, New York 1999. Zbl0997.93503MR1645435DOI10.1002/9780470317037
  15. Zhu, Q. X., 10.1016/j.jmaa.2007.06.071, J. Math. Anal. Appl. 339 (2008), 1, 691-704. MR2370686DOI10.1016/j.jmaa.2007.06.071
