Displaying similar documents to “Adaptive control for discrete-time Markov processes with unbounded costs: Discounted criterion”

Approximation and estimation in Markov control processes under a discounted criterion

J. Adolfo Minjárez-Sosa (2004)

Kybernetika

We consider a class of discrete-time Markov control processes with Borel state and action spaces, and $\mathbb{R}^k$-valued i.i.d. disturbances with unknown density $\rho$. Allowing for possibly unbounded costs, we combine suitable density estimation methods for $\rho$ with approximation procedures for the optimal cost function to show the existence of a sequence $\{\hat{f}_t\}$ of minimizers converging to an optimal stationary policy $f$.

Estimates of stability of Markov control processes with unbounded costs

Evgueni I. Gordienko, Francisco Salem-Silva (2000)

Kybernetika

For a discrete-time Markov control process with transition probability $p$, we compare the total discounted costs $V_\beta(\pi_\beta)$ and $V_\beta(\tilde{\pi}_\beta)$ obtained when applying the optimal control policy $\pi_\beta$ and its approximation $\tilde{\pi}_\beta$. The policy $\tilde{\pi}_\beta$ is optimal for an approximating process with transition probability $\tilde{p}$. The cost per stage of the processes considered may be unbounded. Under certain ergodicity assumptions we establish an upper bound for the relative stability index $[V_\beta(\tilde{\pi}_\beta) - V_\beta(\pi_\beta)] / V_\beta(\pi_\beta)$. This bound does not depend...
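
To make the relative stability index concrete, the following is a minimal sketch for a finite state and action model, which is an assumption made here for illustration (the abstract treats a more general setting and derives a bound rather than computing the index). The function names, the array layout `P[a, x, y]` and `c[x, a]`, and the use of value iteration are all hypothetical choices, not taken from the paper. Costs are assumed strictly positive so that the denominator is nonzero.

```python
import numpy as np

def value_iteration(P, c, beta, tol=1e-10, max_iter=100_000):
    """Approximate the optimal discounted cost and a greedy policy.

    P[a, x, y] is the probability of moving from x to y under action a;
    c[x, a] is the one-stage cost (layout is an assumption of this sketch).
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Q[x, a] = c(x, a) + beta * sum_y P(y | x, a) * V(y)
        Q = c + beta * np.einsum("axy,y->xa", P, V)
        V_new = Q.min(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = c + beta * np.einsum("axy,y->xa", P, V)
    return V, Q.argmin(axis=1)

def policy_cost(P, c, beta, policy):
    """Discounted cost of a stationary policy: solve (I - beta * P_pi) V = c_pi."""
    n_states = P.shape[1]
    idx = np.arange(n_states)
    P_pi = P[policy, idx, :]   # row x is P(. | x, policy[x])
    c_pi = c[idx, policy]
    return np.linalg.solve(np.eye(n_states) - beta * P_pi, c_pi)

def relative_stability_index(P, P_tilde, c, beta):
    """[V(pi_tilde) - V(pi)] / V(pi), with both costs evaluated under the true P."""
    _, pi_opt = value_iteration(P, c, beta)
    _, pi_approx = value_iteration(P_tilde, c, beta)  # optimal for the approximation
    V_opt = policy_cost(P, c, beta, pi_opt)
    V_approx = policy_cost(P, c, beta, pi_approx)
    return (V_approx - V_opt) / V_opt
```

The index is computed state by state. It is nonnegative up to numerical tolerance, because the policy that is optimal for the approximating process cannot do better than the optimal policy when both are evaluated on the true process.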

Estimation and control in finite Markov decision processes with the average reward criterion

Rolando Cavazos-Cadena, Raúl Montes-de-Oca (2004)

Applicationes Mathematicae

This work concerns Markov decision chains with finite state and action sets. The transition law satisfies the simultaneous Doeblin condition but is unknown to the controller, and the problem of determining an optimal adaptive policy with respect to the average reward criterion is addressed. A subset of policies is identified so that, when the system evolves under a policy in that class, the frequency estimators of the transition law are consistent on an essential set of admissible state-action...
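
As an illustration of what a frequency estimator of the transition law can look like for finite state and action sets, here is a minimal sketch. The function name, the `(x, a, y)` history format, and the uniform fallback for unvisited state-action pairs are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def frequency_estimator(history, n_states, n_actions):
    """Estimate P(y | x, a) by relative frequencies of observed transitions.

    history is an iterable of (x, a, y) triples: state, action, next state.
    Pairs (x, a) never visited get a uniform row (an assumption of this sketch).
    """
    counts = np.zeros((n_states, n_actions, n_states))
    for x, a, y in history:
        counts[x, a, y] += 1
    visits = counts.sum(axis=2, keepdims=True)        # times (x, a) was observed
    p_hat = np.full_like(counts, 1.0 / n_states)      # fallback for unvisited pairs
    visited = visits[..., 0] > 0
    p_hat[visited] = counts[visited] / visits[visited]
    return p_hat
```

The estimate for a pair `(x, a)` can only improve if that pair keeps being visited, which is why the choice of policy under which the system evolves matters for consistency.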