Sample path average optimality of Markov control processes with strictly unbounded cost

Oscar Vega-Amaya

Applicationes Mathematicae (1999)

Volume: 26, Issue: 4, page 363-381
ISSN: 1233-7234

Abstract

We study the existence of sample path average cost (SPAC-) optimal policies for Markov control processes on Borel spaces with strictly unbounded costs, i.e., costs that grow without bound on the complement of compact subsets. Assuming only that the cost function is lower semicontinuous and that the transition law is weakly continuous, we show the existence of a relaxed policy with 'minimal' expected average cost and that the optimal average cost is the limit of discounted programs. Moreover, we show that if such a policy induces a positive Harris recurrent Markov chain, then it is also sample path average (SPAC-) optimal. We apply our results to inventory systems and, in a particular case, we compute explicitly a deterministic stationary SPAC-optimal policy.
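
For orientation, the criteria named in the abstract can be written out as follows. This is only a sketch in notation commonly used in this literature and is not taken from the record itself: the symbols $x_t$, $a_t$, $c$, $\pi$, $J$, $J_0$, $V_\alpha$ and $\rho^*$ are assumptions, and the paper should be consulted for its precise definitions and hypotheses.

```latex
% Assumed notation: x_t = state, a_t = control, c = one-stage cost,
% \pi = policy, E_x^\pi = expectation under \pi from initial state x.
\begin{align*}
J(\pi,x)   &:= \limsup_{n\to\infty} \frac{1}{n}\,
              E_x^{\pi}\Big[\sum_{t=0}^{n-1} c(x_t,a_t)\Big]
              && \text{(expected average cost)} \\
J_0(\pi,x) &:= \limsup_{n\to\infty} \frac{1}{n}
              \sum_{t=0}^{n-1} c(x_t,a_t)
              && \text{(sample path average cost)} \\
V_\alpha(x) &:= \inf_{\pi}\, E_x^{\pi}\Big[\sum_{t=0}^{\infty}
              \alpha^{t}\, c(x_t,a_t)\Big], \quad 0<\alpha<1
              && \text{(discounted value)}
\end{align*}
% SPAC-optimality of \pi^* (usual form): there is a constant \rho^* with
%   J_0(\pi^*,x) = \rho^*   P_x^{\pi^*}-a.s. for every x, and
%   J_0(\pi,x) \ge \rho^*   P_x^{\pi}-a.s.   for every \pi and x.
```

Under this reading, $J_0$ is a random variable while $J$ is a number, which is why sample path optimality is the stronger notion. The statement that the optimal average cost is "the limit of discounted programs" is usually understood in the vanishing-discount sense, with quantities of the type $(1-\alpha)V_\alpha$ approaching the optimal average cost as $\alpha \uparrow 1$; the exact form used is given in the paper.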

How to cite

Vega-Amaya, Oscar. "Sample path average optimality of Markov control processes with strictly unbounded cost." Applicationes Mathematicae 26.4 (1999): 363-381. <http://eudml.org/doc/219246>.

@article{Vega1999,
abstract = {We study the existence of sample path average cost (SPAC-) optimal policies for Markov control processes on Borel spaces with strictly unbounded costs, i.e., costs that grow without bound on the complement of compact subsets. Assuming only that the cost function is lower semicontinuous and that the transition law is weakly continuous, we show the existence of a relaxed policy with 'minimal' expected average cost and that the optimal average cost is the limit of discounted programs. Moreover, we show that if such a policy induces a positive Harris recurrent Markov chain, then it is also sample path average (SPAC-) optimal. We apply our results to inventory systems and, in a particular case, we compute explicitly a deterministic stationary SPAC-optimal policy.},
author = {Vega-Amaya, Oscar},
journal = {Applicationes Mathematicae},
keywords = {strictly unbounded costs; sample path average cost criterion; inventory systems; Markov control processes},
language = {eng},
number = {4},
pages = {363-381},
title = {Sample path average optimality of Markov control processes with strictly unbounded cost},
url = {http://eudml.org/doc/219246},
volume = {26},
year = {1999},
}

TY - JOUR
AU - Vega-Amaya, Oscar
TI - Sample path average optimality of Markov control processes with strictly unbounded cost
JO - Applicationes Mathematicae
PY - 1999
VL - 26
IS - 4
SP - 363
EP - 381
AB - We study the existence of sample path average cost (SPAC-) optimal policies for Markov control processes on Borel spaces with strictly unbounded costs, i.e., costs that grow without bound on the complement of compact subsets. Assuming only that the cost function is lower semicontinuous and that the transition law is weakly continuous, we show the existence of a relaxed policy with 'minimal' expected average cost and that the optimal average cost is the limit of discounted programs. Moreover, we show that if such a policy induces a positive Harris recurrent Markov chain, then it is also sample path average (SPAC-) optimal. We apply our results to inventory systems and, in a particular case, we compute explicitly a deterministic stationary SPAC-optimal policy.
LA - eng
KW - strictly unbounded costs; sample path average cost criterion; inventory systems; Markov control processes
UR - http://eudml.org/doc/219246
ER -

References

A. Arapostathis et al. (1993), Discrete time controlled Markov processes with an average cost criterion: A survey, SIAM J. Control Optim. 31, 282-344. Zbl0770.93064
D. P. Bertsekas (1987), Dynamic Programming: Deterministic and Stochastic Models, Prentice-Hall, Englewood Cliffs, NJ. Zbl0649.93001
D. P. Bertsekas and S. E. Shreve (1978), Stochastic Optimal Control: The Discrete Time Case, Academic Press, New York. Zbl0471.93002
P. Billingsley (1968), Convergence of Probability Measures, Wiley. Zbl0172.21201
V. S. Borkar (1991), Topics in Controlled Markov Chains, Pitman Res. Notes Math. Ser. 240, Longman Sci. Tech. Zbl0725.93082
R. Cavazos-Cadena and E. Fernández-Gaucherand (1995), Denumerable controlled Markov chains with average reward criterion: sample path optimality, Z. Oper. Res. 41, 89-108. Zbl0835.90116
R. M. Dudley (1989), Real Analysis and Probability, Wadsworth & Brooks. Zbl0686.60001
P. Hall and C. C. Heyde (1980), Martingale Limit Theory and Its Application, Academic Press. Zbl0462.60045
O. Hernández-Lerma (1993), Existence of average optimal policies in Markov control processes with strictly unbounded costs, Kybernetika 29, 1-17. Zbl0792.93120
O. Hernández-Lerma and J. B. Lasserre (1995), Invariant probabilities for Feller-Markov chains, J. Appl. Math. Stochastic Anal. 8, 341-345. Zbl0870.60061
O. Hernández-Lerma and J. B. Lasserre (1996), Discrete-Time Markov Control Processes: Basic Optimality Criteria, Springer, New York. Zbl0840.93001
O. Hernández-Lerma and J. B. Lasserre (1997), Policy iteration for average cost Markov control processes on Borel spaces, Acta Appl. Math., to appear. Zbl0872.93080
O. Hernández-Lerma and M. Muñoz-de-Osak (1992), Discrete-time Markov control processes with discounted unbounded cost: optimality criteria, Kybernetika 28, 191-212. Zbl0771.93054
O. Hernández-Lerma, O. Vega-Amaya and G. Carrasco (1998), Sample-path optimality and variance-minimization of average cost Markov control processes, Reporte Interno #236, Departamento de Matemáticas, CINVESTAV-IPN, México City. Zbl0951.93074
K. Hinderer (1970), Foundations of Non-Stationary Dynamic Programming with Discrete Time Parameters, Lecture Notes in Oper. Res. and Math. Systems 33, Springer, Berlin. Zbl0202.18401
J. B. Lasserre (1997), Sample-path average optimality for Markov control processes, Report No. 97102, LAAS-CNRS, Toulouse. Zbl0956.93066
H. L. Lee and S. Nahmias (1993), Single-product, single-location models, in: Logistics of Production and Inventory, S. C. Graves, A. H. G. Rinnooy Kan and P. H. Zipkin (eds.), Handbooks in Operations Research and Management Science, Vol. 4, North-Holland, 3-51.
P. Mandl and M. Lausmanová (1991), Two extensions of asymptotic methods in controlled Markov chains, Ann. Oper. Res. 28, 67-80. Zbl0754.60081
S. P. Meyn (1989), Ergodic theorems for discrete time stochastic systems using a stochastic Lyapunov function, SIAM J. Control Optim. 27, 1409-1439. Zbl0681.60067
S. P. Meyn (1995), The policy iteration algorithm for average reward Markov decision processes with general state space, preprint, Coordinated Science Laboratory, University of Illinois, Urbana, IL.
S. P. Meyn and R. L. Tweedie (1993), Markov Chains and Stochastic Stability, Springer, London. Zbl0925.60001
M. Parlar and R. Rempała (1992), Stochastic inventory problem with piecewise quadratic holding cost function containing a cost-free interval, J. Optim. Theory Appl. 75, 133-153. Zbl0795.90014
O. Vega-Amaya and R. Montes-de-Oca (1998), Application of average dynamic programming to inventory systems, Math. Methods Oper. Res. 47, 451-471. Zbl0940.90007

Citations in EuDML Documents

Oscar Vega-Amaya, Fernando Luque-Vásquez, Sample-path average cost optimality for semi-Markov control processes on Borel spaces: unbounded costs and mean holding times
