Empirical approximation in Markov games under unbounded payoff: discounted and average criteria

Fernando Luque-Vásquez; J. Adolfo Minjárez-Sosa

Kybernetika (2017)

  • Volume: 53, Issue: 4, pages 694-716
  • ISSN: 0023-5954

Abstract

This work deals with a class of discrete-time zero-sum Markov games whose state process $\{x_t\}$ evolves according to the equation $x_{t+1} = F(x_t, a_t, b_t, \xi_t)$, where $a_t$ and $b_t$ represent the actions of players 1 and 2, respectively, and $\{\xi_t\}$ is a sequence of independent and identically distributed random variables with unknown distribution $\theta$. Assuming a possibly unbounded payoff, and using the empirical distribution to estimate $\theta$, we introduce approximation schemes for the value of the game as well as for optimal strategies, considering both the discounted and average criteria.
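
To make the role of the empirical distribution concrete, the following is a minimal sketch, not the authors' scheme. It assumes that observed disturbances $\xi_0, \dots, \xi_{t-1}$ are available and replaces an integral against the unknown $\theta$ by a sample average, that is, $\int v(F(x,a,b,s))\,\theta(ds) \approx \frac{1}{t}\sum_{i<t} v(F(x,a,b,\xi_i))$. The dynamics `F`, the test function `v`, and the Gaussian disturbance are hypothetical choices made only for illustration.

```python
import random


def empirical_expectation(v, F, x, a, b, samples):
    """Average v(F(x, a, b, s)) over the observed disturbances s.

    This is the integral of v(F(x, a, b, .)) against the empirical
    distribution that puts mass 1/len(samples) on each observation.
    """
    return sum(v(F(x, a, b, s)) for s in samples) / len(samples)


# Hypothetical dynamics x_{t+1} = F(x_t, a_t, b_t, xi_t); not from the paper.
def F(x, a, b, s):
    return 0.5 * x + a - b + s


# Hypothetical unbounded function of the next state; not from the paper.
def v(x):
    return x * x


# Simulated disturbances standing in for observed data (assumed standard normal).
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

estimate = empirical_expectation(v, F, 1.0, 0.2, 0.1, samples)
```

Under these assumed choices the next state is $0.6 + \xi$, so the exact expectation is $0.6^2 + 1 = 1.36$, and the sample average should approach that value as the number of observations grows.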

How to cite


Luque-Vásquez, Fernando, and Minjárez-Sosa, J. Adolfo. "Empirical approximation in Markov games under unbounded payoff: discounted and average criteria." Kybernetika 53.4 (2017): 694-716. <http://eudml.org/doc/294830>.

@article{Luque2017,
abstract = {This work deals with a class of discrete-time zero-sum Markov games whose state process $\left\lbrace x_{t}\right\rbrace $ evolves according to the equation $ x_{t+1}=F(x_{t},a_{t},b_{t},\xi _{t}),$ where $a_{t}$ and $b_{t}$ represent the actions of players 1 and 2, respectively, and $\left\lbrace \xi _{t}\right\rbrace $ is a sequence of independent and identically distributed random variables with unknown distribution $\theta $. Assuming a possibly unbounded payoff, and using the empirical distribution to estimate $\theta $, we introduce approximation schemes for the value of the game as well as for optimal strategies, considering both the discounted and average criteria.},
author = {Luque-Vásquez, Fernando and Minjárez-Sosa, J. Adolfo},
journal = {Kybernetika},
keywords = {Markov games; empirical estimation; discounted and average criteria},
language = {eng},
number = {4},
pages = {694-716},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Empirical approximation in Markov games under unbounded payoff: discounted and average criteria},
url = {http://eudml.org/doc/294830},
volume = {53},
year = {2017},
}

TY - JOUR
AU - Luque-Vásquez, Fernando
AU - Minjárez-Sosa, J. Adolfo
TI - Empirical approximation in Markov games under unbounded payoff: discounted and average criteria
JO - Kybernetika
PY - 2017
PB - Institute of Information Theory and Automation AS CR
VL - 53
IS - 4
SP - 694
EP - 716
AB - This work deals with a class of discrete-time zero-sum Markov games whose state process $\left\lbrace x_{t}\right\rbrace $ evolves according to the equation $ x_{t+1}=F(x_{t},a_{t},b_{t},\xi _{t}),$ where $a_{t}$ and $b_{t}$ represent the actions of players 1 and 2, respectively, and $\left\lbrace \xi _{t}\right\rbrace $ is a sequence of independent and identically distributed random variables with unknown distribution $\theta $. Assuming a possibly unbounded payoff, and using the empirical distribution to estimate $\theta $, we introduce approximation schemes for the value of the game as well as for optimal strategies, considering both the discounted and average criteria.
LA - eng
KW - Markov games; empirical estimation; discounted and average criteria
UR - http://eudml.org/doc/294830
ER -

