Partially observable Markov decision processes with partially observable random discount factors
E. Everardo Martinez-Garcia; J. Adolfo Minjárez-Sosa; Oscar Vega-Amaya
Kybernetika (2022)
- Volume: 58, Issue: 6, page 960-983
- ISSN: 0023-5954
How to cite
Martinez-Garcia, E. Everardo, Minjárez-Sosa, J. Adolfo, and Vega-Amaya, Oscar. "Partially observable Markov decision processes with partially observable random discount factors." Kybernetika 58.6 (2022): 960-983. <http://eudml.org/doc/299522>.
@article{Martinez2022,
abstract = {This paper deals with a class of partially observable discounted Markov decision processes defined on Borel state and action spaces, under unbounded one-stage cost. The discount rate is a stochastic process evolving according to a difference equation, which is also assumed to be partially observable. Introducing a suitable control model and filtering processes, we prove the existence of optimal control policies. In addition, we illustrate our results in a class of GI/GI/1 queueing systems where we obtain explicitly the corresponding optimality equation and the filtering process.},
author = {Martinez-Garcia, E. Everardo and Minjárez-Sosa, J. Adolfo and Vega-Amaya, Oscar},
journal = {Kybernetika},
keywords = {partially observable systems; discounted criterion; random discount factors; queueing models; optimal policies},
language = {eng},
number = {6},
pages = {960-983},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Partially observable Markov decision processes with partially observable random discount factors},
url = {http://eudml.org/doc/299522},
volume = {58},
year = {2022},
}
TY - JOUR
AU - Martinez-Garcia, E. Everardo
AU - Minjárez-Sosa, J. Adolfo
AU - Vega-Amaya, Oscar
TI - Partially observable Markov decision processes with partially observable random discount factors
JO - Kybernetika
PY - 2022
PB - Institute of Information Theory and Automation AS CR
VL - 58
IS - 6
SP - 960
EP - 983
AB - This paper deals with a class of partially observable discounted Markov decision processes defined on Borel state and action spaces, under unbounded one-stage cost. The discount rate is a stochastic process evolving according to a difference equation, which is also assumed to be partially observable. Introducing a suitable control model and filtering processes, we prove the existence of optimal control policies. In addition, we illustrate our results in a class of GI/GI/1 queueing systems where we obtain explicitly the corresponding optimality equation and the filtering process.
LA - eng
KW - partially observable systems; discounted criterion; random discount factors; queueing models; optimal policies
UR - http://eudml.org/doc/299522
ER -