Empirical approximation in Markov games under unbounded payoff: discounted and average criteria
Fernando Luque-Vásquez; J. Adolfo Minjárez-Sosa
Kybernetika (2017)
- Volume: 53, Issue: 4, page 694-716
- ISSN: 0023-5954
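Abstract
This work deals with a class of discrete-time zero-sum Markov games whose state process $\{x_{t}\}$ evolves according to the equation $x_{t+1}=F(x_{t},a_{t},b_{t},\xi_{t})$, where $a_{t}$ and $b_{t}$ represent the actions of player 1 and 2, respectively, and $\{\xi_{t}\}$ is a sequence of independent and identically distributed random variables with unknown distribution $\theta$. Assuming possibly unbounded payoff, and using the empirical distribution to estimate $\theta$, we introduce approximation schemes for the value of the game as well as for optimal strategies considering both, discounted and average criteria.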
How to cite
Luque-Vásquez, Fernando, and Minjárez-Sosa, J. Adolfo. "Empirical approximation in Markov games under unbounded payoff: discounted and average criteria." Kybernetika 53.4 (2017): 694-716. <http://eudml.org/doc/294830>.
@article{Luque2017,
abstract = {This work deals with a class of discrete-time zero-sum Markov games whose state process $\left\lbrace x_{t}\right\rbrace $ evolves according to the equation $ x_{t+1}=F(x_{t},a_{t},b_{t},\xi _{t}),$ where $a_{t}$ and $b_{t}$ represent the actions of player 1 and 2, respectively, and $\left\lbrace \xi _{t}\right\rbrace $ is a sequence of independent and identically distributed random variables with unknown distribution $\theta $. Assuming possibly unbounded payoff, and using the empirical distribution to estimate $\theta $, we introduce approximation schemes for the value of the game as well as for optimal strategies considering both, discounted and average criteria.},
author = {Luque-Vásquez, Fernando and Minjárez-Sosa, J. Adolfo},
journal = {Kybernetika},
keywords = {Markov games; empirical estimation; discounted and average criteria},
language = {eng},
number = {4},
pages = {694--716},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Empirical approximation in Markov games under unbounded payoff: discounted and average criteria},
url = {http://eudml.org/doc/294830},
volume = {53},
year = {2017},
}
TY - JOUR
AU - Luque-Vásquez, Fernando
AU - Minjárez-Sosa, J. Adolfo
TI - Empirical approximation in Markov games under unbounded payoff: discounted and average criteria
JO - Kybernetika
PY - 2017
PB - Institute of Information Theory and Automation AS CR
VL - 53
IS - 4
SP - 694
EP - 716
AB - This work deals with a class of discrete-time zero-sum Markov games whose state process $\left\lbrace x_{t}\right\rbrace $ evolves according to the equation $ x_{t+1}=F(x_{t},a_{t},b_{t},\xi _{t}),$ where $a_{t}$ and $b_{t}$ represent the actions of player 1 and 2, respectively, and $\left\lbrace \xi _{t}\right\rbrace $ is a sequence of independent and identically distributed random variables with unknown distribution $\theta $. Assuming possibly unbounded payoff, and using the empirical distribution to estimate $\theta $, we introduce approximation schemes for the value of the game as well as for optimal strategies considering both, discounted and average criteria.
LA - eng
KW - Markov games; empirical estimation; discounted and average criteria
UR - http://eudml.org/doc/294830
ER -
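Illustration
The abstract's central idea, estimating the unknown disturbance distribution $\theta$ by the empirical distribution of observed disturbances, can be sketched in a few lines of Python. This is only a minimal illustration and not the authors' approximation scheme: the dynamics `F`, the three-point disturbance support, and the names `empirical_distribution` and `empirical_expectation` are all hypothetical choices made for the example.

```python
import random
from collections import Counter


def empirical_distribution(samples):
    """Map each observed disturbance value to its relative frequency."""
    n = len(samples)
    counts = Counter(samples)
    return {value: count / n for value, count in counts.items()}


def empirical_expectation(g, samples):
    """Approximate E[g(xi)] under theta by the average of g over the samples."""
    return sum(g(s) for s in samples) / len(samples)


def F(x, a, b, xi):
    """Hypothetical dynamics x_{t+1} = F(x_t, a_t, b_t, xi_t); an assumption
    for illustration, not the model studied in the paper."""
    return max(x + a - b + xi, 0)


# Observed disturbances, drawn here from a distribution that the
# estimator itself is assumed not to know.
rng = random.Random(0)
samples = [rng.choice([-1, 0, 1]) for _ in range(1000)]

# Empirical estimate of theta.
theta_hat = empirical_distribution(samples)

# Estimated expected next state from x = 2 under actions a = 1, b = 0.
x, a, b = 2, 1, 0
next_state_mean = empirical_expectation(lambda xi: F(x, a, b, xi), samples)
```

As more disturbances are observed, expectations taken under `theta_hat` replace the unknown expectations under $\theta$; the paper's contribution concerns when the resulting game values and strategies converge, which this sketch does not address.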