Average cost Markov control processes with weighted norms: value iteration

Evgueni Gordienko; Onésimo Hernández-Lerma

Applicationes Mathematicae (1995)

  • Volume: 23, Issue: 2, pages 219-237
  • ISSN: 1233-7234

Abstract

This paper shows the convergence of the value iteration (or successive approximations) algorithm for average cost (AC) Markov control processes on Borel spaces, with possibly unbounded cost, under appropriate hypotheses on weighted norms for the cost function and the transition law. It is also shown that the aforementioned convergence implies strong forms of AC-optimality and the existence of forecast horizons.

How to cite

Gordienko, Evgueni, and Hernández-Lerma, Onésimo. "Average cost Markov control processes with weighted norms: value iteration." Applicationes Mathematicae 23.2 (1995): 219-237. <http://eudml.org/doc/219127>.

@article{Gordienko1995,
abstract = {This paper shows the convergence of the value iteration (or successive approximations) algorithm for average cost (AC) Markov control processes on Borel spaces, with possibly unbounded cost, under appropriate hypotheses on weighted norms for the cost function and the transition law. It is also shown that the aforementioned convergence implies strong forms of AC-optimality and the existence of forecast horizons.},
author = {Gordienko, Evgueni and Hernández-Lerma, Onésimo},
journal = {Applicationes Mathematicae},
keywords = {average cost optimality equation; strong average optimality; (discrete-time) Markov control processes; long-run average cost; weighted norms; Markov control processes; convergence; value iteration; average cost optimality},
language = {eng},
number = {2},
pages = {219--237},
title = {Average cost Markov control processes with weighted norms: value iteration},
url = {http://eudml.org/doc/219127},
volume = {23},
year = {1995},
}

TY - JOUR
AU - Gordienko, Evgueni
AU - Hernández-Lerma, Onésimo
TI - Average cost Markov control processes with weighted norms: value iteration
JO - Applicationes Mathematicae
PY - 1995
VL - 23
IS - 2
SP - 219
EP - 237
AB - This paper shows the convergence of the value iteration (or successive approximations) algorithm for average cost (AC) Markov control processes on Borel spaces, with possibly unbounded cost, under appropriate hypotheses on weighted norms for the cost function and the transition law. It is also shown that the aforementioned convergence implies strong forms of AC-optimality and the existence of forecast horizons.
LA - eng
KW - average cost optimality equation; strong average optimality; (discrete-time) Markov control processes; long-run average cost; weighted norms; Markov control processes; convergence; value iteration; average cost optimality
UR - http://eudml.org/doc/219127
ER -

References

  1. D. P. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models, Prentice-Hall, Englewood Cliffs, N.J., 1987. Zbl0649.93001
  2. E. B. Dynkin and A. A. Yushkevich, Controlled Markov Processes, Springer, New York, 1979. Zbl0073.34801
  3. J. Flynn, On optimality criteria for dynamic programs with long finite horizons, J. Math. Anal. Appl. 76 (1980), 202-208. Zbl0438.90100
  4. E. Gordienko and O. Hernández-Lerma, Average cost Markov control processes with weighted norms: existence of canonical policies, this volume, 199-218. Zbl0829.93067
  5. O. Hernández-Lerma, Adaptive Markov Control Processes, Springer, New York, 1989.
  6. O. Hernández-Lerma and J. B. Lasserre, A forecast horizon and a stopping rule for general Markov decision processes, J. Math. Anal. Appl. 132 (1988), 388-400. Zbl0646.90090
  7. O. Hernández-Lerma and J. B. Lasserre, Average cost optimal policies for Markov control processes with Borel state space and unbounded costs, Systems Control Lett. 15 (1990), 349-356. Zbl0723.93080
  8. O. Hernández-Lerma and J. B. Lasserre, Linear programming and average optimality of Markov control processes on Borel spaces-unbounded costs, SIAM J. Control Optim. 32 (1994), 480-500. Zbl0799.90120
  9. O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes, book in preparation. Zbl0724.93087
  10. G. P. Klimov, Existence of a final distribution for an irreducible Feller process with invariant measure, Math. Notes 37 (1985), 161-163. Zbl0659.60101
  11. R. Montes-de-Oca and O. Hernández-Lerma, Value iteration in average cost Markov control processes on Borel spaces, Acta Appl. Math., to appear. Zbl0843.93093
  12. E. Nummelin, General Irreducible Markov Chains and Non-Negative Operators, Cambridge University Press, Cambridge, 1984. Zbl0551.60066
  13. R. Rempała, Forecast horizon in a dynamic family of one-dimensional control problems, Dissertationes Math. 315 (1991). Zbl0754.90063
  14. H. L. Royden, Real Analysis, 2nd ed., Macmillan, New York, 1971. Zbl0197.03501
  15. M. Schäl, Conditions for optimality and for the limit of n-stage optimal policies to be optimal, Z. Wahrsch. Verw. Gebiete 32 (1975), 179-196. Zbl0316.90080
  16. L. I. Sennott, Value iteration in countable state average cost Markov decision processes with unbounded costs, Ann. Oper. Res. 28 (1991), 261-272. Zbl0729.90088
  17. D. J. White, Dynamic programming, Markov chains, and the method of successive approximations, J. Math. Anal. Appl. 6 (1963), 373-376. Zbl0124.36404

Citations in EuDML Documents

  1. Evgueni Gordienko, Onésimo Hernández-Lerma, Average cost Markov control processes with weighted norms: existence of canonical policies
  2. Evgueni I. Gordienko, Francisco Salem-Silva, Estimates of stability of Markov control processes with unbounded costs
  3. Evgueni I. Gordienko, J. Adolfo Minjárez-Sosa, Adaptive control for discrete-time Markov processes with unbounded costs: Discounted criterion
  4. Raúl Montes-de-Oca, Francisco Salem-Silva, Estimates for perturbations of average Markov decision processes with a minimal state and upper bounded by stochastically ordered Markov chains
  5. Fernando Luque-Vásquez, J. Adolfo Minjárez-Sosa, Empirical approximation in Markov games under unbounded payoff: discounted and average criteria
  6. Yofre H. García, Saul Diaz-Infante, J. Adolfo Minjárez-Sosa, Partially observable queueing systems with controlled service rates under a discounted optimality criterion
  7. Onésimo Hernández-Lerma, Oscar Vega-Amaya, Infinite-horizon Markov control processes with undiscounted cost criteria: from average to overtaking optimality
