Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion

J. Minjárez-Sosa

Applicationes Mathematicae (1999)

  • Volume: 26, Issue: 3, page 267-280
  • ISSN: 1233-7234

Abstract

We introduce average cost optimal adaptive policies in a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to the system equations $x_{t+1}=F(x_t,a_t,ξ_t)$, t=1,2,..., with i.i.d. $ℝ^k$-valued random vectors $ξ_t$, which are observable but whose density ϱ is unknown.
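As a rough, purely illustrative sketch of the setting described above (not taken from the paper), the following Python code simulates a hypothetical scalar system F(x,a,ξ) = max(x - a, 0) + ξ under a fixed action and forms a Gaussian kernel density estimate of the unknown disturbance density ϱ from the observed ξ_t. The dynamics F, the exponential noise, the bandwidth, and the helper names are assumptions made here for illustration; the paper's actual adaptive policy construction (using the density estimate in the discounted/average-cost optimality equations) is not reproduced.

# Minimal simulation sketch; all concrete choices are illustrative assumptions,
# not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

def F(x, a, xi):
    # Hypothetical inventory-type dynamics; the paper only assumes a general measurable F.
    return max(x - a, 0.0) + xi

def kde(sample, bandwidth):
    # Gaussian kernel density estimate of the unknown disturbance density rho.
    sample = np.asarray(sample, dtype=float)
    def rho_hat(z):
        u = (z - sample) / bandwidth
        return np.exp(-0.5 * u**2).sum() / (len(sample) * bandwidth * np.sqrt(2.0 * np.pi))
    return rho_hat

# Simulate the controlled process; the disturbances xi_t are observable and recorded.
x, disturbances = 5.0, []
for t in range(500):
    xi = rng.exponential(1.0)   # "true" density: Exp(1), unknown to the controller
    x = F(x, 1.0, xi)           # fixed action a_t = 1 for simplicity
    disturbances.append(xi)

# Nonparametric estimate of rho after 500 observations; an adaptive policy
# would use such an estimate in place of the unknown rho when choosing controls.
rho_hat = kde(disturbances, bandwidth=0.3)
print(rho_hat(1.0), np.exp(-1.0))   # estimated vs. true density at z = 1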

How to cite


Minjárez-Sosa, J. "Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion." Applicationes Mathematicae 26.3 (1999): 267-280. <http://eudml.org/doc/219238>.

@article{Minjárez1999,
abstract = {We introduce average cost optimal adaptive policies in a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to the system equations $x_{t+1}=F(x_t,a_t,ξ_t)$, t=1,2,..., with i.i.d. $ℝ^k$-valued random vectors $ξ_t$, which are observable but whose density ϱ is unknown.},
author = {Minjárez-Sosa, J.},
journal = {Applicationes Mathematicae},
keywords = {Markov control process; discounted and average cost criterion; adaptive policy},
language = {eng},
number = {3},
pages = {267-280},
title = {Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion},
url = {http://eudml.org/doc/219238},
volume = {26},
year = {1999},
}

TY - JOUR
AU - Minjárez-Sosa, J.
TI - Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion
JO - Applicationes Mathematicae
PY - 1999
VL - 26
IS - 3
SP - 267
EP - 280
AB - We introduce average cost optimal adaptive policies in a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to the system equations $x_{t+1}=F(x_t,a_t,ξ_t)$, t=1,2,..., with i.i.d. $ℝ^k$-valued random vectors $ξ_t$, which are observable but whose density ϱ is unknown.
LA - eng
KW - Markov control process; discounted and average cost criterion; adaptive policy
UR - http://eudml.org/doc/219238
ER -

