Optimal stationary policies in risk-sensitive dynamic programs with finite state space and nonnegative rewards

Rolando Cavazos-Cadena; Raúl Montes-de-Oca

Applicationes Mathematicae (2000)

  • Volume: 27, Issue: 2, pages 167-185
  • ISSN: 1233-7234

Abstract

This work concerns controlled Markov chains with finite state space and nonnegative rewards; it is assumed that the controller has a constant risk-sensitivity, and that the performance of a control policy is measured by a risk-sensitive expected total-reward criterion. The existence of optimal stationary policies is studied within this context, and the main result establishes the optimality of a stationary policy achieving the supremum in the corresponding optimality equation, whenever the associated Markov chain has a unique positive recurrent class. Two explicit examples are provided to show that, if such an additional condition fails, an optimal stationary policy cannot be generally guaranteed. The results of this note, which consider both the risk-seeking and the risk-averse cases, answer an extended version of a question recently posed in Puterman (1994).
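To fix notation for the terms used above, a standard formulation of a risk-sensitive expected total-reward criterion, in the spirit of Howard and Matheson (1972), runs as follows; the symbols below are illustrative and not necessarily those of the paper. Given a finite state space S, admissible actions A(x), a transition law p(y|x,a), nonnegative rewards r(x,a), and a constant risk-sensitivity coefficient λ ≠ 0, a policy π starting at state x is evaluated through the certainty equivalent of its total reward,

\[ V_\lambda^\pi(x) = \frac{1}{\lambda}\,\log E_x^\pi\left[\exp\Big(\lambda \sum_{t=0}^{\infty} r(X_t, A_t)\Big)\right], \]

where λ > 0 corresponds to a risk-seeking controller and λ < 0 to a risk-averse one. In such a setting the optimality equation for the value function \( V_\lambda(x) = \sup_\pi V_\lambda^\pi(x) \) reads

\[ V_\lambda(x) = \sup_{a \in A(x)} \Big\{ r(x,a) + \frac{1}{\lambda}\,\log \sum_{y \in S} p(y \mid x, a)\, e^{\lambda V_\lambda(y)} \Big\}, \]

and the main result described in the abstract concerns a stationary policy attaining the supremum on the right-hand side, provided the induced Markov chain has a unique positive recurrent class.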

How to cite

Cavazos-Cadena, Rolando, and Montes-de-Oca, Raúl. "Optimal stationary policies in risk-sensitive dynamic programs with finite state space and nonnegative rewards." Applicationes Mathematicae 27.2 (2000): 167-185. <http://eudml.org/doc/219265>.

@article{Cavazos2000,
abstract = {This work concerns controlled Markov chains with finite state space and nonnegative rewards; it is assumed that the controller has a constant risk-sensitivity, and that the performance of a control policy is measured by a risk-sensitive expected total-reward criterion. The existence of optimal stationary policies is studied within this context, and the main result establishes the optimality of a stationary policy achieving the supremum in the corresponding optimality equation, whenever the associated Markov chain has a unique positive recurrent class. Two explicit examples are provided to show that, if such an additional condition fails, an optimal stationary policy cannot be generally guaranteed. The results of this note, which consider both the risk-seeking and the risk-averse cases, answer an extended version of a question recently posed in Puterman (1994).},
author = {Cavazos-Cadena, Rolando and Montes-de-Oca, Raúl},
journal = {Applicationes Mathematicae},
keywords = {unichain property; Markov decision processes; risk-sensitive optimality equation; risk-sensitive expected total-reward criterion},
language = {eng},
number = {2},
pages = {167-185},
title = {Optimal stationary policies in risk-sensitive dynamic programs with finite state space and nonnegative rewards},
url = {http://eudml.org/doc/219265},
volume = {27},
year = {2000},
}

TY - JOUR
AU - Cavazos-Cadena, Rolando
AU - Montes-de-Oca, Raúl
TI - Optimal stationary policies in risk-sensitive dynamic programs with finite state space and nonnegative rewards
JO - Applicationes Mathematicae
PY - 2000
VL - 27
IS - 2
SP - 167
EP - 185
AB - This work concerns controlled Markov chains with finite state space and nonnegative rewards; it is assumed that the controller has a constant risk-sensitivity, and that the performance of a control policy is measured by a risk-sensitive expected total-reward criterion. The existence of optimal stationary policies is studied within this context, and the main result establishes the optimality of a stationary policy achieving the supremum in the corresponding optimality equation, whenever the associated Markov chain has a unique positive recurrent class. Two explicit examples are provided to show that, if such an additional condition fails, an optimal stationary policy cannot be generally guaranteed. The results of this note, which consider both the risk-seeking and the risk-averse cases, answer an extended version of a question recently posed in Puterman (1994).
LA - eng
KW - unichain property; Markov decision processes; risk-sensitive optimality equation; risk-sensitive expected total-reward criterion
UR - http://eudml.org/doc/219265
ER -

References

  1. M. G. Ávila-Godoy (1998), Controlled Markov chains with exponential risk-sensitive criteria: modularity, structured policies and applications, Ph.D. Dissertation, Dept. of Math., Univ. of Arizona, Tucson, AZ.
  2. R. Cavazos-Cadena and E. Fernández-Gaucherand (1999), Controlled Markov chains with risk-sensitive criteria: average cost, optimality equations, and optimal solutions, Math. Methods Oper. Res. 43, 121-139. Zbl0953.93077
  3. R. Cavazos-Cadena and R. Montes-de-Oca (1999), Optimal stationary policies in controlled Markov chains with the expected total-reward criterion, Research Report No. 1.01.010.99, Univ. Autónoma Metropolitana, Campus Iztapalapa, México, D.F. Zbl0937.90114
  4. P. C. Fishburn (1970), Utility Theory for Decision Making, Wiley, New York. Zbl0213.46202
  5. W. H. Fleming and D. Hernández-Hernández (1997), Risk-sensitive control of finite machines on an infinite horizon I, SIAM J. Control Optim. 35, 1790-1810. Zbl0891.93085
  6. O. Hernández-Lerma (1989), Adaptive Markov Control Processes, Springer, New York. Zbl0698.90053
  7. K. Hinderer (1970), Foundations of Non-Stationary Dynamic Programming with Discrete Time Parameter, Lecture Notes in Oper. Res. 33, Springer, New York. Zbl0202.18401
  8. R. A. Howard and J. E. Matheson (1972), Risk-sensitive Markov decision processes, Management Sci. 18, 356-369. Zbl0238.90007
  9. M. Loève (1977), Probability Theory I, 4th ed., Springer, New York. Zbl0359.60001
  10. J. W. Pratt (1964), Risk aversion in the small and in the large, Econometrica 32, 122-136. Zbl0132.13906
  11. M. L. Puterman (1994), Markov Decision Processes, Wiley, New York. Zbl0829.90134
  12. S. M. Ross (1970), Applied Probability Models with Optimization Applications, Holden-Day, San Francisco. Zbl0213.19101
  13. R. Strauch (1966), Negative dynamic programming, Ann. Math. Statist. 37, 871-890. Zbl0144.43201
