Markov decision processes on finite spaces with fuzzy total rewards

Karla Carrero-Vera; Hugo Cruz-Suárez; Raúl Montes-de-Oca

Kybernetika (2022)

  • Volume: 58, Issue: 2, page 180-199
  • ISSN: 0023-5954

Abstract

The paper concerns Markov decision processes (MDPs) in which both the state and the decision spaces are finite and the total reward is the objective function. For this class of MDPs, the authors assume that the reward function is fuzzy. Specifically, the fuzzy reward function has a suitable trapezoidal shape that is a function of a standard non-fuzzy reward. The fuzzy control problem consists of determining a control policy that maximizes the fuzzy expected total reward, where maximization is with respect to the partial order on the α-cuts of fuzzy numbers. The optimal policy and the optimal value function of the fuzzy optimal control problem are characterized by means of the dynamic programming equation of the standard optimal control problem. The main conclusions are that the optimal policies of the standard and fuzzy problems coincide and that the fuzzy optimal value function has a convenient trapezoidal form. As illustrations, fuzzy extensions of an optimal stopping problem and of a red-black gambling model are presented.
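
To make the ordering on α-cuts concrete, the following is a minimal Python sketch, not the authors' construction. It assumes the common parametrization of a trapezoidal fuzzy number by four reals a ≤ b ≤ c ≤ d, whose α-cut is the interval [a + α(b − a), d − α(d − c)], and it compares two such numbers endpoint-wise on every α-cut. The names `Trapezoid`, `alpha_cut` and `leq` are illustrative and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy number (a, b, c, d) with a <= b <= c <= d (assumed form)."""
    a: float
    b: float
    c: float
    d: float

    def alpha_cut(self, alpha: float) -> tuple[float, float]:
        """Return the closed interval {x : membership(x) >= alpha} for alpha in [0, 1]."""
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must lie in [0, 1]")
        lower = self.a + alpha * (self.b - self.a)
        upper = self.d - alpha * (self.d - self.c)
        return (lower, upper)


def leq(x: Trapezoid, y: Trapezoid) -> bool:
    """Partial order: x <= y if both alpha-cut endpoints of x are <= those of y.

    For trapezoids the endpoints are linear in alpha, so checking
    alpha = 0 and alpha = 1 covers every alpha in between.
    """
    for alpha in (0.0, 1.0):
        x_lo, x_hi = x.alpha_cut(alpha)
        y_lo, y_hi = y.alpha_cut(alpha)
        if x_lo > y_lo or x_hi > y_hi:
            return False
    return True


if __name__ == "__main__":
    low = Trapezoid(0.0, 1.0, 2.0, 3.0)
    high = Trapezoid(1.0, 2.0, 3.0, 4.0)
    wide = Trapezoid(0.0, 1.0, 2.0, 5.0)
    print(low.alpha_cut(0.5))
    print(leq(low, high))
    print(leq(wide, high), leq(high, wide))
```

In this sketch `wide` and `high` are incomparable, which is why the order is only partial: a policy is optimal in the fuzzy sense when its fuzzy expected total reward dominates that of every other policy under this comparison.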

How to cite

Carrero-Vera, Karla, Cruz-Suárez, Hugo, and Montes-de-Oca, Raúl. "Markov decision processes on finite spaces with fuzzy total rewards." Kybernetika 58.2 (2022): 180-199. <http://eudml.org/doc/298894>.

@article{Carrero2022,
abstract = {The paper concerns Markov decision processes (MDPs) with both the state and the decision spaces being finite and with the total reward as the objective function. For such a kind of MDPs, the authors assume that the reward function is of a fuzzy type. Specifically, this fuzzy reward function is of a suitable trapezoidal shape which is a function of a standard non-fuzzy reward. The fuzzy control problem consists of determining a control policy that maximizes the fuzzy expected total reward, where the maximization is made with respect to the partial order on the $\alpha $-cuts of fuzzy numbers. The optimal policy and the optimal value function for the fuzzy optimal control problem are characterized by means of the dynamic programming equation of the standard optimal control problem and, as main conclusions, it is obtained that the optimal policy of the standard problem and the fuzzy one coincide and the fuzzy optimal value function is of a convenient trapezoidal form. As illustrations, fuzzy extensions of an optimal stopping problem and of a red-black gambling model are presented.},
author = {Carrero-Vera, Karla and Cruz-Suárez, Hugo and Montes-de-Oca, Raúl},
journal = {Kybernetika},
keywords = {Markov decision process; total reward; fuzzy reward; trapezoidal fuzzy number; optimal stopping problem; gambling model},
language = {eng},
number = {2},
pages = {180-199},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Markov decision processes on finite spaces with fuzzy total rewards},
url = {http://eudml.org/doc/298894},
volume = {58},
year = {2022},
}

TY - JOUR
AU - Carrero-Vera, Karla
AU - Cruz-Suárez, Hugo
AU - Montes-de-Oca, Raúl
TI - Markov decision processes on finite spaces with fuzzy total rewards
JO - Kybernetika
PY - 2022
PB - Institute of Information Theory and Automation AS CR
VL - 58
IS - 2
SP - 180
EP - 199
AB - The paper concerns Markov decision processes (MDPs) with both the state and the decision spaces being finite and with the total reward as the objective function. For such a kind of MDPs, the authors assume that the reward function is of a fuzzy type. Specifically, this fuzzy reward function is of a suitable trapezoidal shape which is a function of a standard non-fuzzy reward. The fuzzy control problem consists of determining a control policy that maximizes the fuzzy expected total reward, where the maximization is made with respect to the partial order on the $\alpha $-cuts of fuzzy numbers. The optimal policy and the optimal value function for the fuzzy optimal control problem are characterized by means of the dynamic programming equation of the standard optimal control problem and, as main conclusions, it is obtained that the optimal policy of the standard problem and the fuzzy one coincide and the fuzzy optimal value function is of a convenient trapezoidal form. As illustrations, fuzzy extensions of an optimal stopping problem and of a red-black gambling model are presented.
LA - eng
KW - Markov decision process; total reward; fuzzy reward; trapezoidal fuzzy number; optimal stopping problem; gambling model
UR - http://eudml.org/doc/298894
ER -

