Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion

J. Minjárez-Sosa

Applicationes Mathematicae (1999)

Volume: 26, Issue: 3, page 267-280
ISSN: 1233-7234

Abstract

We introduce average cost optimal adaptive policies in a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to the system equations $x_{t+1}=F(x_t,a_t,ξ_t)$, t=1,2,..., with i.i.d. $ℝ^k$-valued random vectors $ξ_t$, which are observable but whose density ϱ is unknown.
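The setting in the abstract can be illustrated with a minimal sketch: simulate a system of the form x_{t+1} = F(x_t, a_t, ξ_t), record the observed disturbances, and estimate their density nonparametrically. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar linear dynamics `F`, the linear feedback policy, the standard normal disturbance, and the Gaussian kernel estimator with a fixed bandwidth. The abstract does not say which density estimator the paper uses.

```python
import math
import random


def F(x, a, xi):
    # Hypothetical dynamics (assumption, not from the paper): stable linear system.
    return 0.5 * x + a + xi


def simulate(policy, x1, horizon, seed=0):
    """Run x_{t+1} = F(x_t, a_t, xi_t) and return the states and observed noises."""
    rng = random.Random(seed)
    x = x1
    states, noises = [x], []
    for _ in range(horizon):
        a = policy(x)
        # The true density (standard normal here) is unknown to the controller,
        # but each realization xi_t is observed.
        xi = rng.gauss(0.0, 1.0)
        x = F(x, a, xi)
        noises.append(xi)
        states.append(x)
    return states, noises


def kernel_density_estimate(samples, bandwidth):
    """Return a Gaussian-kernel estimate of the density that generated `samples`."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def rho_hat(s):
        return norm * sum(
            math.exp(-0.5 * ((s - xi) / bandwidth) ** 2) for xi in samples
        )

    return rho_hat


# Illustrative run with a hypothetical linear feedback policy.
states, noises = simulate(policy=lambda x: -0.25 * x, x1=0.0, horizon=2000)
rho_hat = kernel_density_estimate(noises, bandwidth=0.3)
```

In an adaptive scheme of this general kind, the estimate `rho_hat` would be recomputed as more disturbances are observed and substituted for the unknown density when choosing actions; the sketch stops at the estimation step.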

How to cite

MLA
BibTeX
RIS

Minjárez-Sosa, J. "Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion." Applicationes Mathematicae 26.3 (1999): 267-280. <http://eudml.org/doc/219238>.

@article{Minjárez1999,
abstract = {We introduce average cost optimal adaptive policies in a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to the system equations $x_{t+1}=F(x_t,a_t,ξ_t)$, t=1,2,..., with i.i.d. $ℝ^k$-valued random vectors $ξ_t$, which are observable but whose density ϱ is unknown.},
author = {Minjárez-Sosa, J.},
journal = {Applicationes Mathematicae},
keywords = {Markov control process; discounted and average cost criterion; adaptive policy},
language = {eng},
number = {3},
pages = {267-280},
title = {Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion},
url = {http://eudml.org/doc/219238},
volume = {26},
year = {1999},
}

TY - JOUR
AU - Minjárez-Sosa, J.
TI - Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion
JO - Applicationes Mathematicae
PY - 1999
VL - 26
IS - 3
SP - 267
EP - 280
AB - We introduce average cost optimal adaptive policies in a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to the system equations $x_{t+1}=F(x_t,a_t,ξ_t)$, t=1,2,..., with i.i.d. $ℝ^k$-valued random vectors $ξ_t$, which are observable but whose density ϱ is unknown.
LA - eng
KW - Markov control process; discounted and average cost criterion; adaptive policy
UR - http://eudml.org/doc/219238
ER -

References

[1] D. Blackwell, Discrete dynamic programming, Ann. Math. Statist. 33 (1962), 719-726. Zbl0133.12906
[2] E. B. Dynkin and A. A. Yushkevich, Controlled Markov Processes, Springer, New York, 1979. Zbl0073.34801
[3] E. I. Gordienko, Adaptive strategies for certain classes of controlled Markov processes, Theory Probab. Appl. 29 (1985), 504-518. Zbl0577.93067
[4] E. I. Gordienko and O. Hernández-Lerma, Average cost Markov control processes with weighted norms: existence of canonical policies, Appl. Math. (Warsaw) 23 (1995), 199-218. Zbl0829.93067
[5] E. I. Gordienko and J. A. Minjárez-Sosa, Adaptive control for discrete-time Markov processes with unbounded costs: discounted criterion, Kybernetika 34 (1998), no. 2, 217-234. Zbl1274.90474
[6] E. I. Gordienko and J. A. Minjárez-Sosa, Adaptive control for discrete-time Markov processes with unbounded costs: average criterion, Math. Methods Oper. Res. 48 (1998), 37-55. Zbl0952.90043
[7] R. Hasminskii and I. Ibragimov, On density estimation in the view of Kolmogorov's ideas in approximation theory, Ann. Statist. 18 (1990), 999-1010. Zbl0705.62039
[8] O. Hernández-Lerma, Adaptive Markov Control Processes, Springer, New York, 1989.
[9] O. Hernández-Lerma, Infinite-horizon Markov control processes with undiscounted cost criteria: from average to overtaking optimality, Reporte Interno 165, Departamento de Matemáticas, CINVESTAV-IPN, México, 1994. Zbl0906.93062
[10] O. Hernández-Lerma and R. Cavazos-Cadena, Density estimation and adaptive control of Markov processes: average and discounted criteria, Acta Appl. Math. 20 (1990), 285-307. Zbl0717.93066
[11] S. A. Lippman, On dynamic programming with unbounded rewards, Manag. Sci. 21 (1975), 1225-1233. Zbl0309.90017
[12] P. Mandl, Estimation and control in Markov chains, Adv. Appl. Probab. 6 (1974), 40-60. Zbl0281.60070
[13] U. Rieder, Measurable selection theorems for optimization problems, Manuscripta Math. 24 (1978), 115-131. Zbl0385.28005
[14] J. A. E. E. Van Nunen and J. Wessels, A note on dynamic programming with unbounded rewards, Manag. Sci. 24 (1978), 576-580. Zbl0374.49015

Citations in EuDML Documents

Fernando Luque-Vásquez, J. Adolfo Minjárez-Sosa, Empirical approximation in Markov games under unbounded payoff: discounted and average criteria
