Time-discretization for controlled Markov processes. I. General approximation results
Nico M. van Dijk, Arie Hordijk (1996)
Kybernetika
Nico M. van Dijk, Arie Hordijk (1996)
Kybernetika
Evgueni I. Gordienko, Francisco Salem-Silva (2000)
Kybernetika
For a discrete-time Markov control process with a given transition probability, we compare the total discounted costs incurred when applying the optimal control policy and when applying its approximation. The approximating policy is optimal for an approximating process with a different transition probability. The cost per stage of the processes considered may be unbounded. Under certain ergodicity assumptions we establish an upper bound for the relative stability index. This bound does not depend...
Armando F. Mendoza-Pérez, Onésimo Hernández-Lerma (2012)
Applicationes Mathematicae
This paper deals with discrete-time Markov control processes in Borel spaces with unbounded rewards. Under suitable hypotheses, we show that a randomized stationary policy is optimal for a certain expected constrained problem (ECP) if and only if it is optimal for the corresponding pathwise constrained problem (pathwise CP). Moreover, we show that a certain parametric family of unconstrained optimality equations yields convergence properties that lead to an approximation scheme which...
Evgueni I. Gordienko, J. Adolfo Minjárez-Sosa (1998)
Kybernetika
We study the adaptive control problem for discrete-time Markov control processes with Borel state and action spaces and possibly unbounded one-stage costs. The processes are given by recurrence equations driven by i.i.d. random vectors whose density is unknown. Assuming that these random vectors are observable, we propose a procedure for statistical estimation of the density that allows us to prove discounted asymptotic optimality of two types of adaptive policies used earlier for processes with bounded...
Onésimo Hernández-Lerma (1987)
Kybernetika