Approximation and adaptive control of Markov processes: Average reward criterion
Onésimo Hernández-Lerma (1987)
Kybernetika
Similarity:
Rolando Cavazos-Cadena, Raúl Montes-de-Oca (2004)
Applicationes Mathematicae
Similarity:
This work concerns Markov decision chains with finite state and action sets. The transition law satisfies the simultaneous Doeblin condition but is unknown to the controller, and the problem of determining an optimal adaptive policy with respect to the average reward criterion is addressed. A subset of policies is identified so that, when the system evolves under a policy in that class, the frequency estimators of the transition law are consistent on an essential set of admissible state-action...
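The frequency estimators mentioned in the abstract are the empirical transition probabilities hatP(y | x, a) = N(x, a, y) / N(x, a). A minimal simulation sketch of such an estimator, assuming a hypothetical 2-state, 2-action chain (the transition array P and the randomized policy below are illustrative, not taken from the paper):

```python
import numpy as np

# Hypothetical 2-state, 2-action Markov decision chain; the true
# transition law P[a, x, y] is known only to the simulator, not
# to the controller, which must estimate it from observations.
rng = np.random.default_rng(0)
P = np.array([[[0.7, 0.3], [0.4, 0.6]],   # action 0
              [[0.2, 0.8], [0.5, 0.5]]])  # action 1

def frequency_estimator(policy, steps=20000):
    """Empirical (frequency) estimator of the transition law:
    hatP(y | x, a) = N(x, a, y) / N(x, a)."""
    counts = np.zeros_like(P)
    x = 0
    for _ in range(steps):
        a = policy(x)
        y = rng.choice(2, p=P[a, x])  # sample next state from the true law
        counts[a, x, y] += 1
        x = y
    visits = counts.sum(axis=2, keepdims=True)
    # Leave never-visited state-action pairs as NaN rather than 0.
    return np.divide(counts, visits,
                     out=np.full_like(counts, np.nan),
                     where=visits > 0)

# A randomized stationary policy tries both actions in every state, so
# every state-action pair is visited infinitely often and the estimator
# is consistent there -- the "essential set" idea in the abstract.
hatP = frequency_estimator(lambda x: int(rng.integers(2)))
```

With 20,000 steps each state-action pair receives on the order of 5,000 samples, so `hatP` agrees with `P` to within a few hundredths.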
Łukasz Stettner (1993)
Applicationes Mathematicae
Similarity:
Optimal control of a partially observed Markov process with a long-run average cost functional is considered. Under the assumption that the transition probabilities are equivalent, the existence of a solution to the Bellman equation is shown, and optimal strategies are constructed from it.
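The Bellman equation for the long-run average cost criterion can be written in its standard form (notation assumed here, not taken from the paper: cost c, transition kernel p, optimal average cost \rho, relative value function h):

```latex
\rho + h(x) \;=\; \min_{a \in A(x)} \Big[\, c(x,a) \;+\; \sum_{y} p(y \mid x, a)\, h(y) \,\Big].
```

Any stationary strategy that selects a minimizing action in each state is then average-cost optimal; in the partially observed setting the state x is replaced by the filtering (belief) distribution over the unobserved state.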
Nico M. van Dijk, Arie Hordijk (1996)
Kybernetika
Similarity:
Gerhard Hübner (1983)
Acta Universitatis Carolinae. Mathematica et Physica
Similarity: