Approximation and estimation in Markov control processes under a discounted criterion

J. Adolfo Minjárez-Sosa

Kybernetika (2004)

  • Volume: 40, Issue: 6, page [681]-690
  • ISSN: 0023-5954

Abstract

We consider a class of discrete-time Markov control processes with Borel state and action spaces, and $\Re^{k}$-valued i.i.d. disturbances with unknown density $\rho$. Supposing possibly unbounded costs, we combine suitable density estimation methods of $\rho$ with approximation procedures of the optimal cost function, to show the existence of a sequence $\{\hat{f}_{t}\}$ of minimizers converging to an optimal stationary policy $f_{\infty}$.
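The estimate-then-optimize idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's construction: the paper works with Borel spaces, a density on $\Re^{k}$, and possibly unbounded costs, whereas this toy assumes a finite state and action grid, an integer-valued disturbance, a histogram estimator standing in for the density estimator, and plain value iteration standing in for the approximation procedure. The dynamics `step`, the cost `cost`, the discount factor `ALPHA`, and all numeric values are hypothetical choices made only for illustration.

```python
import random

# Hypothetical toy model, not taken from the paper: finite states and actions,
# integer-valued disturbance with an unknown probability mass function (pmf).
STATES = range(6)        # stock levels 0..5
ACTIONS = range(3)       # order 0, 1, or 2 units
W_VALUES = range(4)      # disturbance (demand) values 0..3
ALPHA = 0.9              # discount factor (assumed)


def step(x, a, w):
    """System equation x' = F(x, a, w), clipped to the state grid."""
    return min(max(x + a - w, 0), len(STATES) - 1)


def cost(x, a):
    """Illustrative one-stage cost: holding + ordering + stock-out penalty."""
    return 1.0 * x + 2.0 * a + (5.0 if x == 0 else 0.0)


def estimate_pmf(samples):
    """Empirical pmf of the observed disturbances (histogram estimator)."""
    counts = [0] * len(W_VALUES)
    for w in samples:
        counts[w] += 1
    return [c / len(samples) for c in counts]


def q_value(x, a, v, pmf):
    """Cost of action a in state x, followed by cost-to-go v, under pmf."""
    expected = sum(p * v[step(x, a, w)] for w, p in zip(W_VALUES, pmf))
    return cost(x, a) + ALPHA * expected


def value_iteration(pmf, n_iter=300):
    """Approximate the discounted optimal cost and return a greedy policy."""
    v = [0.0] * len(STATES)
    for _ in range(n_iter):
        v = [min(q_value(x, a, v, pmf) for a in ACTIONS) for x in STATES]
    policy = [min(ACTIONS, key=lambda a: q_value(x, a, v, pmf)) for x in STATES]
    return v, policy


if __name__ == "__main__":
    rng = random.Random(0)
    true_pmf = [0.1, 0.4, 0.3, 0.2]  # unknown to the controller
    samples = []
    for t in (10, 100, 1000, 10000):
        # Observe more disturbances, re-estimate, and recompute the minimizer.
        samples += rng.choices(list(W_VALUES), weights=true_pmf, k=t - len(samples))
        _, policy_t = value_iteration(estimate_pmf(samples))
        print(t, policy_t)
    print("true", value_iteration(true_pmf)[1])
```

Running the script prints the greedy policy computed from the estimated pmf at several sample sizes, next to the policy computed from the true pmf, so the two can be compared as the estimate improves. In this finite toy the comparison is only suggestive; the convergence result stated in the abstract depends on the assumptions made in the paper itself.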

How to cite

Minjárez-Sosa, J. Adolfo. "Approximation and estimation in Markov control processes under a discounted criterion." Kybernetika 40.6 (2004): [681]-690. <http://eudml.org/doc/33728>.

@article{Minjárez2004,
abstract = {We consider a class of discrete-time Markov control processes with Borel state and action spaces, and $\Re ^{k}$-valued i.i.d. disturbances with unknown density $\rho .$ Supposing possibly unbounded costs, we combine suitable density estimation methods of $\rho $ with approximation procedures of the optimal cost function, to show the existence of a sequence $\lbrace \hat{f}_{t}\rbrace $ of minimizers converging to an optimal stationary policy $f_{\infty }.$},
author = {Minjárez-Sosa, J. Adolfo},
journal = {Kybernetika},
keywords = {Markov control processes; density estimation; discounted cost criterion},
language = {eng},
number = {6},
pages = {[681]-690},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Approximation and estimation in Markov control processes under a discounted criterion},
url = {http://eudml.org/doc/33728},
volume = {40},
year = {2004},
}

TY - JOUR
AU - Minjárez-Sosa, J. Adolfo
TI - Approximation and estimation in Markov control processes under a discounted criterion
JO - Kybernetika
PY - 2004
PB - Institute of Information Theory and Automation AS CR
VL - 40
IS - 6
SP - [681]
EP - 690
AB - We consider a class of discrete-time Markov control processes with Borel state and action spaces, and $\Re ^{k}$-valued i.i.d. disturbances with unknown density $\rho .$ Supposing possibly unbounded costs, we combine suitable density estimation methods of $\rho $ with approximation procedures of the optimal cost function, to show the existence of a sequence $\lbrace \hat{f}_{t}\rbrace $ of minimizers converging to an optimal stationary policy $f_{\infty }.$
LA - eng
KW - Markov control processes; density estimation; discounted cost criterion
UR - http://eudml.org/doc/33728
ER -

Citations in EuDML Documents

  1. Yofre H. García, Saul Diaz-Infante, J. Adolfo Minjárez-Sosa, Partially observable queueing systems with controlled service rates under a discounted optimality criterion
  2. Beatris A. Escobedo-Trujillo, Carmen G. Higuera-Chan, Time-varying Markov decision processes with state-action-dependent discount factors and unbounded costs
  3. E. Everardo Martinez-Garcia, J. Adolfo Minjárez-Sosa, Oscar Vega-Amaya, Partially observable Markov decision processes with partially observable random discount factors
