Estimates of stability of Markov control processes with unbounded costs

Evgueni I. Gordienko; Francisco Salem-Silva

Kybernetika (2000)

  • Volume: 36, Issue: 2, page [195]-210
  • ISSN: 0023-5954

Abstract

For a discrete-time Markov control process with transition probability $p$, we compare the total discounted costs $V_\beta(\pi_\beta)$ and $V_\beta(\tilde{\pi}_\beta)$ incurred when applying the optimal control policy $\pi_\beta$ and its approximation $\tilde{\pi}_\beta$. The policy $\tilde{\pi}_\beta$ is optimal for an approximating process with transition probability $\tilde{p}$. The cost per stage of the processes considered may be unbounded. Under certain ergodicity assumptions we establish an upper bound for the relative stability index $[V_\beta(\tilde{\pi}_\beta) - V_\beta(\pi_\beta)] / V_\beta(\pi_\beta)$. This bound does not depend on the discount factor $\beta \in (0,1)$ and is given in terms of the total variation distance between $p$ and $\tilde{p}$.
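
The quantities named in the abstract can be illustrated numerically. The sketch below is not the authors' method and does not reproduce their bound. It assumes a toy setting with finite state and action spaces and bounded, strictly positive costs, which is far simpler than the unbounded-cost setting of the paper. All function names, array layouts, and numbers are hypothetical. It computes the relative stability index per initial state and one common convention for the total variation distance (the full L1 norm, maximized over state-action pairs; some authors use half of this).

```python
import numpy as np

def value_iteration(p, c, beta, tol=1e-10, max_iter=100_000):
    """Minimize discounted cost. p has shape (A, S, S), c has shape (S, A).

    Returns the approximate optimal value vector and a greedy policy.
    """
    v = np.zeros(c.shape[0])
    for _ in range(max_iter):
        # q[s, a] = c[s, a] + beta * sum_t p[a, s, t] * v[t]
        q = c + beta * np.einsum("ast,t->sa", p, v)
        v_new = q.min(axis=1)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    q = c + beta * np.einsum("ast,t->sa", p, v)
    return v, q.argmin(axis=1)

def policy_cost(p, c, beta, policy):
    """Exact discounted cost of a stationary deterministic policy under p."""
    n = c.shape[0]
    idx = np.arange(n)
    p_pi = p[policy, idx, :]   # transition matrix induced by the policy
    c_pi = c[idx, policy]      # one-stage cost induced by the policy
    return np.linalg.solve(np.eye(n) - beta * p_pi, c_pi)

def relative_stability_index(p, p_tilde, c, beta):
    """[V(pi_tilde) - V(pi)] / V(pi), both costs evaluated under the true p."""
    _, pi = value_iteration(p, c, beta)
    _, pi_tilde = value_iteration(p_tilde, c, beta)
    v_opt = policy_cost(p, c, beta, pi)
    v_approx = policy_cost(p, c, beta, pi_tilde)
    return (v_approx - v_opt) / v_opt

def total_variation(p, p_tilde):
    """Max over (a, s) of sum_t |p[a, s, t] - p_tilde[a, s, t]| (L1 convention)."""
    return np.abs(p - p_tilde).sum(axis=2).max()

# Hypothetical two-state, two-action example.
p = np.array([[[0.9, 0.1], [0.5, 0.5]],
              [[0.2, 0.8], [0.6, 0.4]]])
p_tilde = np.array([[[0.8, 0.2], [0.5, 0.5]],
                    [[0.2, 0.8], [0.7, 0.3]]])
c = np.array([[1.0, 2.0],
              [4.0, 3.0]])

index = relative_stability_index(p, p_tilde, c, beta=0.9)
distance = total_variation(p, p_tilde)
```

Here `index` holds one value per initial state, and it is zero whenever the policy computed from `p_tilde` coincides with the one computed from `p`.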

How to cite

Gordienko, Evgueni I., and Salem-Silva, Francisco. "Estimates of stability of Markov control processes with unbounded costs." Kybernetika 36.2 (2000): [195]-210. <http://eudml.org/doc/33478>.

@article{Gordienko2000,
abstract = {For a discrete-time Markov control process with the transition probability $p$, we compare the total discounted costs $V_\beta (\pi _\beta )$ and $V_\beta (\tilde{\pi }_\beta )$, when applying the optimal control policy $\pi _\beta $ and its approximation $\tilde{\pi }_\beta $. The policy $\tilde{\pi }_\beta $ is optimal for an approximating process with the transition probability $\tilde{p}$. A cost per stage for considered processes can be unbounded. Under certain ergodicity assumptions we establish the upper bound for the relative stability index $[V_\beta (\tilde{\pi }_\beta )-V_\beta (\pi _\beta )]/V_\beta (\pi _\beta )$. This bound does not depend on a discount factor $\beta \in (0,1)$ and this is given in terms of the total variation distance between $p$ and $\tilde{p}$.},
author = {Gordienko, Evgueni I. and Salem-Silva, Francisco},
journal = {Kybernetika},
keywords = {discrete-time Markov control process; unbounded cost},
language = {eng},
number = {2},
pages = {[195]-210},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Estimates of stability of Markov control processes with unbounded costs},
url = {http://eudml.org/doc/33478},
volume = {36},
year = {2000},
}

TY - JOUR
AU - Gordienko, Evgueni I.
AU - Salem-Silva, Francisco
TI - Estimates of stability of Markov control processes with unbounded costs
JO - Kybernetika
PY - 2000
PB - Institute of Information Theory and Automation AS CR
VL - 36
IS - 2
SP - [195]
EP - 210
AB - For a discrete-time Markov control process with the transition probability $p$, we compare the total discounted costs $V_\beta (\pi _\beta )$ and $V_\beta (\tilde{\pi }_\beta )$, when applying the optimal control policy $\pi _\beta $ and its approximation $\tilde{\pi }_\beta $. The policy $\tilde{\pi }_\beta $ is optimal for an approximating process with the transition probability $\tilde{p}$. A cost per stage for considered processes can be unbounded. Under certain ergodicity assumptions we establish the upper bound for the relative stability index $[V_\beta (\tilde{\pi }_\beta )-V_\beta (\pi _\beta )]/V_\beta (\pi _\beta )$. This bound does not depend on a discount factor $\beta \in (0,1)$ and this is given in terms of the total variation distance between $p$ and $\tilde{p}$.
LA - eng
KW - discrete-time Markov control process; unbounded cost
UR - http://eudml.org/doc/33478
ER -
