An unbounded Berge's minimum theorem with applications to discounted Markov decision processes
Raúl Montes-de-Oca; Enrique Lemus-Rodríguez
Kybernetika (2012)
- Volume: 48, Issue: 2, page 268-286
- ISSN: 0023-5954
Access Full Article
topAbstract
topHow to cite
topMontes-de-Oca, Raúl, and Lemus-Rodríguez, Enrique. "An unbounded Berge's minimum theorem with applications to discounted Markov decision processes." Kybernetika 48.2 (2012): 268-286. <http://eudml.org/doc/247030>.
@article{Montes2012,
abstract = {This paper deals with a certain class of unbounded optimization problems. The optimization problems taken into account depend on a parameter. Firstly, there are established conditions which permit to guarantee the continuity with respect to the parameter of the minimum of the optimization problems under consideration, and the upper semicontinuity of the multifunction which applies each parameter into its set of minimizers. Besides, with the additional condition of uniqueness of the minimizer, its continuity is given. Some examples of nonconvex optimization problems that satisfy the conditions of the article are supplied. Secondly, the theory developed is applied to discounted Markov decision processes with unbounded cost functions and with possibly noncompact actions sets in order to obtain continuous optimal policies. This part of the paper is illustrated with two examples of the controlled Lindley's random walk. One of these examples has nonconstant action sets.},
author = {Montes-de-Oca, Raúl, Lemus-Rodríguez, Enrique},
journal = {Kybernetika},
keywords = {Berge's minimum theorem; moment function; discounted Markov decision process; uniqueness of the optimal policy; continuous optimal policy; Berge's minimum theorem; moment function; discounted Markov decision process; uniqueness of the optimal policy; continuous optimal policy},
language = {eng},
number = {2},
pages = {268-286},
publisher = {Institute of Information Theory and Automation AS CR},
title = {An unbounded Berge's minimum theorem with applications to discounted Markov decision processes},
url = {http://eudml.org/doc/247030},
volume = {48},
year = {2012},
}
TY - JOUR
AU - Montes-de-Oca, Raúl
AU - Lemus-Rodríguez, Enrique
TI - An unbounded Berge's minimum theorem with applications to discounted Markov decision processes
JO - Kybernetika
PY - 2012
PB - Institute of Information Theory and Automation AS CR
VL - 48
IS - 2
SP - 268
EP - 286
AB - This paper deals with a certain class of unbounded optimization problems. The optimization problems taken into account depend on a parameter. Firstly, there are established conditions which permit to guarantee the continuity with respect to the parameter of the minimum of the optimization problems under consideration, and the upper semicontinuity of the multifunction which applies each parameter into its set of minimizers. Besides, with the additional condition of uniqueness of the minimizer, its continuity is given. Some examples of nonconvex optimization problems that satisfy the conditions of the article are supplied. Secondly, the theory developed is applied to discounted Markov decision processes with unbounded cost functions and with possibly noncompact actions sets in order to obtain continuous optimal policies. This part of the paper is illustrated with two examples of the controlled Lindley's random walk. One of these examples has nonconstant action sets.
LA - eng
KW - Berge's minimum theorem; moment function; discounted Markov decision process; uniqueness of the optimal policy; continuous optimal policy; Berge's minimum theorem; moment function; discounted Markov decision process; uniqueness of the optimal policy; continuous optimal policy
UR - http://eudml.org/doc/247030
ER -
References
top- C. D. Aliprantis, K. C. Border, Infinite Dimensional Analysis., Third Edition. Springer-Verlag, Berlin 2006. Zbl1156.46001MR2378491
- R. B. Ash, Real Variables with Basic Metric Space Topology., IEEE Press, New York 1993. Zbl0920.26002MR1193687
- J. P. Aubin, I. Ekeland, Applied Nonlinear Analysis., John Wiley, New York 1984. Zbl1115.47049MR0749753
- L. M. Ausubel, R. J. Deneckere, 10.1007/BF01213694, Econom. Theory 3 (1993), 99-107. Zbl1002.49500MR1211955DOI10.1007/BF01213694
- C. Berge, Topological Spaces., Oliver and Boyd, Edinburgh and London 1963 (reprinted by Dover Publications, Inc., Mineola, New York 1997). Zbl0114.38602MR1464690
- D. Cruz-Suárez, R. Montes-de-Oca, F. Salem-Silva, 10.1007/s001860400372, Math. Methods Oper. Res. 60 (2004), 415-436. Zbl1104.90053MR2106092DOI10.1007/s001860400372
- J. Dugundji, Topology., Allyn and Bacon, Inc., Boston 1966. Zbl0397.54003MR0193606
- P. K. Dutta, M. K.Majumdar, R. K. Sundaram, 10.1016/0165-1889(94)90048-5, J. Econom. Dynam. Control 18 (1994), 1069-1092. Zbl0875.90096MR1298092DOI10.1016/0165-1889(94)90048-5
- P. K. Dutta, T. Mitra, 10.1016/0304-4068(89)90006-2, J. Math. Econom. 18 (1989), 77-86. MR0985949DOI10.1016/0304-4068(89)90006-2
- O. Hernández-Lerma, J. B. Lasserre, Discrete-Time Markov Control Processes: Basic Optimality Criteria., Springer-Verlag, New York 1996. Zbl0840.93001MR1363487
- O. Hernández-Lerma, W. J. Runggaldier, Monotone approximations for convex stochastic control problems., J. Math. Systems Estim. Control 4 (1994), 99-140. Zbl0812.93078MR1298550
- K. Hinderer, Lipschitz continuity of value functions in Markovian decision Processes., Math. Methods Oper. Res. 60 (2005), 3-22. Zbl1093.90075MR2226965
- K. Hinderer, M. Stieglitz, 10.1007/BF01194330, Math. Methods Oper. Res. 44 (1996), 189-204. Zbl0860.90126MR1409065DOI10.1007/BF01194330
- A. Horsley, A. J. Wrobel, T. Van Zandt, 10.1016/S0165-1765(98)00177-3, Econom. Lett. 61 (1998), 285-291. Zbl0913.90079MR1676329DOI10.1016/S0165-1765(98)00177-3
- J. S. Jordan, 10.2307/1912305, Econometrica 45 (1977), 1365-1376. Zbl0363.90035MR0456573DOI10.2307/1912305
- T. Kamihigashi, 10.1016/j.jmateco.2006.05.007, J. Math. Econom. 43 (2007), 477-500. Zbl1154.91032MR2317118DOI10.1016/j.jmateco.2006.05.007
- T. Kamihigashi, S. Roy, 10.1016/j.jet.2005.06.007, J. Econom. Theory 132 (2007), 435-460. Zbl1142.91667MR2285614DOI10.1016/j.jet.2005.06.007
- R. B. King, Beyond Quartic Equation., Birkhauser, Boston 1996. MR1401346
- M. Kitayev, Semi-Markov and jump Markov control models: average cost criterion., Theory Probab. Appl. 30 (1985), 272-288. MR0792619
- D. V. Lindley, The theory of queues with a single server., Proc. Cambridge Philos. Soc. 48 (1952), 277-289. Zbl0046.35501MR0046597
- M. Majumdar, R. Radner, 10.2307/1912118, Econometrica 51 (1983), 1821-1837. MR0720089DOI10.2307/1912118
- S. P. Meyn, 10.1137/0327073, SIAM J. Control Optim. 27 (1989), 1409-1439. MR1022436DOI10.1137/0327073
- E. A. Ok, Real Analysis with Economic Applications., Princeton University Press, Princeton 2007. Zbl1119.26001MR2275400
- A. L. Peressini, F. E. Sullivan, J. J. Uhl, The Mathematics of Nonlinear Programming., Springer-Verlag, New York 1988. Zbl0663.90054MR0932726
- M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming., John Wiley, New York 1994. Zbl1184.90170MR1270015
- U. Rieder, 10.1007/BF01168566, Manuscripta Math. 24 (1978), 115-131. Zbl0385.28005MR0493590DOI10.1007/BF01168566
- H. L. Royden, Real Analysis., Third Edition. Macmillan, New York 1988. Zbl1191.26002MR1013117
- R. H. Stockbridge, Time-average control of martingale problems: a linear programming formulation., Ann. Probab. 18 (1990), 291-314. Zbl0699.49019MR1043944
- R. Sundaram, A First Course in Optimization Theory., Cambridge University Press, Cambridge 1996. Zbl0885.90106MR1402910
- G. Tian, J. Zhou, 10.1016/0022-247X(92)90302-T, J. Math. Anal. Appl. 166 (1992), 351-364. Zbl0761.90110MR1160931DOI10.1016/0022-247X(92)90302-T
- G. Tian, J. Zhou, 10.1016/0304-4068(94)00687-6, J. Math. Econom. 24 (1995), 281-303. MR1320200DOI10.1016/0304-4068(94)00687-6
- M. Walker, 10.2307/2526431, Internat. Econom. Rev. 20 (1979), 267-272. Zbl0406.90001MR0525439DOI10.2307/2526431
- A. Yushkevich, 10.1137/S0363012995292469, SIAM J. Control Optim. 35 (1997), 2157-2182. Zbl0892.93059MR1478659DOI10.1137/S0363012995292469
Citations in EuDML Documents
top- R. Israel Ortega-Gutiérrez, Raúl Montes-de-Oca, Enrique Lemus-Rodríguez, Uniqueness of optimal policies as a generic property of discounted Markov decision processes: Ekeland's variational principle approach
- Óscar Vega-Amaya, Joaquín López-Borbón, A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
NotesEmbed ?
topTo embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.