Risk-sensitive Markov stopping games with an absorbing state

Jaicer López-Rivero; Rolando Cavazos-Cadena; Hugo Cruz-Suárez

Kybernetika (2022)

  • Volume: 58, Issue: 1, pages 101-122
  • ISSN: 0023-5954

Abstract

This work is concerned with discrete-time Markov stopping games with two players. At each decision time, player II can stop the game by paying a terminal reward to player I, or can let the system continue its evolution. In the latter case, player I applies an action that affects the transitions and entitles him to receive a running reward from player II. It is supposed that player I has a nonzero, constant risk-sensitivity coefficient, and that player II tries to minimize the utility of player I. The performance of a pair of decision strategies is measured by the risk-sensitive (expected) total reward of player I. Besides mild continuity-compactness conditions, the main structural assumption on the model is the existence of an absorbing state that is accessible from any starting point. In this context, it is shown that the value function of the game is characterized by an equilibrium equation, and the existence of a Nash equilibrium is established.
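
The abstract does not spell out the equilibrium equation, so the following is only a minimal sketch under stated assumptions, not the authors' formulation. It assumes a finite state space, an exponential-utility certainty equivalent (1/lam) * log E[exp(lam * Y)] with lam != 0, and an equilibrium operator of the form V(x) = min{ G(x), max_a [ R(x, a) + (1/lam) * log sum_y p(y | x, a) * exp(lam * V(y)) ] }, where G is the terminal reward and R the running reward. All names (`certainty_equivalent`, `apply_operator`, `iterate`) and the three-state toy model are hypothetical.

```python
import math


def certainty_equivalent(lam, probs, values):
    """Return (1/lam) * log(sum_y p(y) * exp(lam * v(y))) for lam != 0."""
    if lam == 0:
        raise ValueError("the risk-sensitivity coefficient must be nonzero")
    total = sum(p * math.exp(lam * v) for p, v in zip(probs, values))
    return math.log(total) / lam


def apply_operator(V, lam, G, R, P):
    """Apply the assumed equilibrium operator once to the vector V."""
    new_V = []
    for x in range(len(V)):
        # Player I picks the action maximizing reward plus certainty equivalent.
        continue_value = max(
            R[x][a] + certainty_equivalent(lam, P[x][a], V)
            for a in range(len(R[x]))
        )
        # Player II stops (paying G[x]) whenever that is cheaper.
        new_V.append(min(G[x], continue_value))
    return new_V


def iterate(lam, G, R, P, sweeps=200):
    """Iterate the operator from V = 0 for a fixed number of sweeps."""
    V = [0.0] * len(G)
    for _ in range(sweeps):
        V = apply_operator(V, lam, G, R, P)
    return V


# Hypothetical toy model: state 0 is absorbing with zero rewards and is
# reachable from states 1 and 2 under every action.
lam = 0.5
G = [0.0, 1.0, 3.0]                      # terminal rewards
R = [[0.0], [0.5, 0.2], [1.0]]           # running rewards R[x][a]
P = [
    [[1.0, 0.0, 0.0]],                   # state 0 stays put
    [[0.5, 0.5, 0.0], [0.2, 0.0, 0.8]],  # state 1, two actions
    [[0.5, 0.25, 0.25]],                 # state 2, one action
]

V = iterate(lam, G, R, P)
print(V)
```

In this toy instance the iterates increase from zero and stay bounded by G, so they settle at a fixed point of the assumed operator. Player II stops in state 1, where the value equals the terminal reward, and lets the game continue in state 2.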

How to cite

López-Rivero, Jaicer, Cavazos-Cadena, Rolando, and Cruz-Suárez, Hugo. "Risk-sensitive Markov stopping games with an absorbing state." Kybernetika 58.1 (2022): 101-122. <http://eudml.org/doc/297915>.

@article{López2022,
abstract = {This work is concerned with discrete-time Markov stopping games with two players. At each decision time, player II can stop the game by paying a terminal reward to player I, or can let the system continue its evolution. In the latter case, player I applies an action that affects the transitions and entitles him to receive a running reward from player II. It is supposed that player I has a nonzero, constant risk-sensitivity coefficient, and that player II tries to minimize the utility of player I. The performance of a pair of decision strategies is measured by the risk-sensitive (expected) total reward of player I. Besides mild continuity-compactness conditions, the main structural assumption on the model is the existence of an absorbing state that is accessible from any starting point. In this context, it is shown that the value function of the game is characterized by an equilibrium equation, and the existence of a Nash equilibrium is established.},
author = {López-Rivero, Jaicer and Cavazos-Cadena, Rolando and Cruz-Suárez, Hugo},
journal = {Kybernetika},
keywords = {monotone operator; fixed point; equilibrium equation; hitting time; bounded rewards; certainty equivalent},
language = {eng},
number = {1},
pages = {101-122},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Risk-sensitive Markov stopping games with an absorbing state},
url = {http://eudml.org/doc/297915},
volume = {58},
year = {2022},
}

TY - JOUR
AU - López-Rivero, Jaicer
AU - Cavazos-Cadena, Rolando
AU - Cruz-Suárez, Hugo
TI - Risk-sensitive Markov stopping games with an absorbing state
JO - Kybernetika
PY - 2022
PB - Institute of Information Theory and Automation AS CR
VL - 58
IS - 1
SP - 101
EP - 122
AB - This work is concerned with discrete-time Markov stopping games with two players. At each decision time, player II can stop the game by paying a terminal reward to player I, or can let the system continue its evolution. In the latter case, player I applies an action that affects the transitions and entitles him to receive a running reward from player II. It is supposed that player I has a nonzero, constant risk-sensitivity coefficient, and that player II tries to minimize the utility of player I. The performance of a pair of decision strategies is measured by the risk-sensitive (expected) total reward of player I. Besides mild continuity-compactness conditions, the main structural assumption on the model is the existence of an absorbing state that is accessible from any starting point. In this context, it is shown that the value function of the game is characterized by an equilibrium equation, and the existence of a Nash equilibrium is established.
LA - eng
KW - monotone operator; fixed point; equilibrium equation; hitting time; bounded rewards; certainty equivalent
UR - http://eudml.org/doc/297915
ER -

