Approximation and adaptive control of Markov processes: Average reward criterion
Onésimo Hernández-Lerma (1987)
Kybernetika
Similarity:
Rolando Cavazos-Cadena, Raúl Montes-de-Oca (2004)
Applicationes Mathematicae
Similarity:
This work concerns Markov decision chains with finite state and action sets. The transition law satisfies the simultaneous Doeblin condition but is unknown to the controller, and the problem of determining an optimal adaptive policy with respect to the average reward criterion is addressed. A subset of policies is identified so that, when the system evolves under a policy in that class, the frequency estimators of the transition law are consistent on an essential set of admissible state-action...
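The frequency estimators mentioned in the abstract are the empirical transition probabilities hatP(y | x, a) = N(x, a, y) / N(x, a). A minimal simulation sketch of such an estimator, assuming a hypothetical 2-state, 2-action chain (the transition array P and the randomized policy below are illustrative, not taken from the paper):

```python
import numpy as np

# Hypothetical 2-state, 2-action Markov decision chain; the true
# transition law P[a, x, y] is known only to the simulator, not
# to the controller, which must estimate it from observations.
rng = np.random.default_rng(0)
P = np.array([[[0.7, 0.3], [0.4, 0.6]],   # action 0
              [[0.2, 0.8], [0.5, 0.5]]])  # action 1

def frequency_estimator(policy, steps=20000):
    """Empirical (frequency) estimator of the transition law:
    hatP(y | x, a) = N(x, a, y) / N(x, a)."""
    counts = np.zeros_like(P)
    x = 0
    for _ in range(steps):
        a = policy(x)
        y = rng.choice(2, p=P[a, x])  # sample next state from the true law
        counts[a, x, y] += 1
        x = y
    visits = counts.sum(axis=2, keepdims=True)
    # Leave never-visited state-action pairs as NaN rather than 0.
    return np.divide(counts, visits,
                     out=np.full_like(counts, np.nan),
                     where=visits > 0)

# A randomized stationary policy tries both actions in every state, so
# every state-action pair is visited infinitely often and the estimator
# is consistent there -- the "essential set" idea in the abstract.
hatP = frequency_estimator(lambda x: int(rng.integers(2)))
```

With 20,000 steps each state-action pair receives on the order of 5,000 samples, so `hatP` agrees with `P` to within a few hundredths.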
Łukasz Stettner (1993)
Applicationes Mathematicae
Similarity:
Optimal control of a partially observed Markov process with a long-run average cost functional is considered. Under the assumption that the transition probabilities are equivalent, the existence of a solution to the Bellman equation is shown, and optimal strategies are constructed from it.
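The Bellman equation for the long-run average cost criterion can be written in its standard form (notation assumed here, not taken from the paper: cost c, transition kernel p, optimal average cost \rho, relative value function h):

```latex
\rho + h(x) \;=\; \min_{a \in A(x)} \Big[\, c(x,a) \;+\; \sum_{y} p(y \mid x, a)\, h(y) \,\Big].
```

Any stationary strategy that selects a minimizing action in each state is then average-cost optimal; in the partially observed setting the state x is replaced by the filtering (belief) distribution over the unobserved state.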
Nico M. van Dijk, Arie Hordijk (1996)
Kybernetika
Similarity:
Gerhard Hübner (1983)
Acta Universitatis Carolinae. Mathematica et Physica
Similarity: