An optimality system for finite average Markov decision chains under risk-aversion

Alfredo Alanís-Durán; Rolando Cavazos-Cadena

Kybernetika (2012)

  • Volume: 48, Issue: 1, pages 83-104
  • ISSN: 0023-5954

Abstract

This work concerns controlled Markov chains with a finite state space and compact action sets. The decision maker is risk-averse with constant risk-sensitivity, and the performance of a control policy is measured by the long-run average cost criterion. Under standard continuity-compactness conditions, it is shown that the (possibly non-constant) optimal value function is characterized by a system of optimality equations from which an optimal stationary policy can be obtained. Also, it is shown that the optimal superior and inferior limit average cost functions coincide.
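
As an illustration of the criterion described in the abstract, the sketch below evaluates the risk-sensitive average cost of a fixed stationary policy on a small chain. It assumes the standard exponential-utility form of the criterion, (1/(λn)) log E_x[exp(λ Σ_{t<n} c(X_t))] with risk-sensitivity λ > 0, which is not spelled out on this page. The function names, the three-state transition matrix, and the costs are hypothetical and are not taken from the paper. The recursion is carried out in the log domain to avoid overflow.

```python
import math


def log_sum_exp(terms):
    """Numerically stable log(sum(exp(t) for t in terms))."""
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def finite_horizon_average(P, c, lam, n):
    """Approximate the risk-sensitive average cost from each initial state.

    Assumed criterion (standard exponential-utility form, not quoted from the paper):
        J_n(x) = (1 / (lam * n)) * log E_x[ exp(lam * sum_{t<n} c(X_t)) ]
    P   : row-stochastic transition matrix under a fixed stationary policy
    c   : one-step cost per state under that policy
    lam : risk-sensitivity coefficient, lam > 0 for a risk-averse controller
    n   : horizon length
    """
    m = len(P)
    u = [0.0] * m  # u_0(x) = log 1 = 0
    for _ in range(n):
        # u_{k+1}(x) = lam * c(x) + log sum_y P(x, y) * exp(u_k(y))
        u = [
            lam * c[x]
            + log_sum_exp(
                [math.log(P[x][y]) + u[y] for y in range(m) if P[x][y] > 0.0]
            )
            for x in range(m)
        ]
    return [u[x] / (lam * n) for x in range(m)]


# Hypothetical example: states 0 and 1 are absorbing with costs 1 and 3;
# state 2 is transient and moves to either absorbing state with probability 1/2.
P = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
]
c = [1.0, 3.0, 0.0]

if __name__ == "__main__":
    print(finite_horizon_average(P, c, lam=1.0, n=2000))
```

In this example the two absorbing states have average costs 1 and 3, while the transient state receives a value close to 3 rather than the risk-neutral value 2. The resulting average cost therefore depends on the initial state, which is the non-constant situation the abstract refers to.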

How to cite


Alanís-Durán, Alfredo, and Cavazos-Cadena, Rolando. "An optimality system for finite average Markov decision chains under risk-aversion." Kybernetika 48.1 (2012): 83-104. <http://eudml.org/doc/247175>.

@article{Alanís2012,
abstract = {This work concerns controlled Markov chains with a finite state space and compact action sets. The decision maker is risk-averse with constant risk-sensitivity, and the performance of a control policy is measured by the long-run average cost criterion. Under standard continuity-compactness conditions, it is shown that the (possibly non-constant) optimal value function is characterized by a system of optimality equations from which an optimal stationary policy can be obtained. Also, it is shown that the optimal superior and inferior limit average cost functions coincide.},
author = {Alanís-Durán, Alfredo and Cavazos-Cadena, Rolando},
journal = {Kybernetika},
keywords = {partition of the state space; nonconstant optimal average cost; discounted approximations to the risk-sensitive average cost criterion; equality of superior and inferior limit risk-averse average criteria},
language = {eng},
number = {1},
pages = {83-104},
publisher = {Institute of Information Theory and Automation AS CR},
title = {An optimality system for finite average Markov decision chains under risk-aversion},
url = {http://eudml.org/doc/247175},
volume = {48},
year = {2012},
}

TY - JOUR
AU - Alanís-Durán, Alfredo
AU - Cavazos-Cadena, Rolando
TI - An optimality system for finite average Markov decision chains under risk-aversion
JO - Kybernetika
PY - 2012
PB - Institute of Information Theory and Automation AS CR
VL - 48
IS - 1
SP - 83
EP - 104
AB - This work concerns controlled Markov chains with a finite state space and compact action sets. The decision maker is risk-averse with constant risk-sensitivity, and the performance of a control policy is measured by the long-run average cost criterion. Under standard continuity-compactness conditions, it is shown that the (possibly non-constant) optimal value function is characterized by a system of optimality equations from which an optimal stationary policy can be obtained. Also, it is shown that the optimal superior and inferior limit average cost functions coincide.
LA - eng
KW - partition of the state space; nonconstant optimal average cost; discounted approximations to the risk-sensitive average cost criterion; equality of superior and inferior limit risk-averse average criteria
UR - http://eudml.org/doc/247175
ER -

