Estimation and control in finite Markov decision processes with the average reward criterion

Rolando Cavazos-Cadena; Raúl Montes-de-Oca

Displaying similar documents to “Estimation and control in finite Markov decision processes with the average reward criterion”

Adaptive control for discrete-time Markov processes with unbounded costs: Discounted criterion

Evgueni I. Gordienko, J. Adolfo Minjárez-Sosa (1998)

Kybernetika

We study the adaptive control problem for discrete-time Markov control processes with Borel state and action spaces and possibly unbounded one-stage costs. The processes are given by recurrence equations $x_{t+1} = F(x_{t}, a_{t}, \xi_{t})$, $t = 0, 1, \ldots$, with i.i.d. $\mathbb{R}^{k}$-valued random vectors $\xi_{t}$ whose density $\rho$ is unknown. Assuming observability of $\xi_{t}$, we propose a statistical estimation procedure for $\rho$ that allows us to prove discounted asymptotic optimality of two types of adaptive policies used earlier for the processes with bounded...

Estimation and adaptive control of span-contracting Markov decision processes

Gerhard Hübner (1991)

Kybernetika

Approximation and adaptive control of Markov processes: Average reward criterion

Onésimo Hernández-Lerma (1987)

Kybernetika

On self-optimizing control of Markov processes

Petr Mandl (1985)

Banach Center Publications

Average cost Markov control processes with weighted norms: existence of canonical policies

Evgueni Gordienko, Onésimo Hernández-Lerma (1995)

Applicationes Mathematicae

This paper considers discrete-time Markov control processes on Borel spaces, with possibly unbounded costs, under the long-run average cost (AC) criterion. Under appropriate hypotheses on weighted norms for the cost function and the transition law, the existence of solutions to the average cost optimality inequality and to the average cost optimality equation is shown, which in turn yields the existence of AC-optimal and AC-canonical policies, respectively.