Approximations for Markov decision problems
Gerhard Hübner (1983)
Acta Universitatis Carolinae. Mathematica et Physica
Similarity:
Raúl Montes-de-Oca, Enrique Lemus-Rodríguez, Daniel Cruz-Suárez (2009)
Kybernetika
Similarity:
In a Discounted Markov Decision Process (DMDP) with finite action sets, the Value Iteration Algorithm, under suitable conditions, leads to an optimal policy in a finite number of steps. Determining an upper bound on the number of steps needed until convergence is of great theoretical and practical interest, as it would provide a computationally feasible stopping rule for value iteration as an algorithm for finding an optimal policy. In this paper we find such a bound...
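For context on the algorithm the abstract refers to, the following is a minimal Python sketch of value iteration for a finite discounted MDP. It uses the classical sup-norm stopping rule that guarantees an epsilon-optimal greedy policy, not the iteration bound derived in the cited paper; the transition data, reward matrix, tolerance epsilon, and the function name value_iteration are illustrative assumptions.

    # Minimal value-iteration sketch for a finite discounted MDP.
    # Stopping rule: ||V_{k+1} - V_k||_inf < epsilon * (1 - gamma) / (2 * gamma),
    # which makes the greedy policy epsilon-optimal (standard result, not the
    # paper's bound on the number of iterations).
    import numpy as np

    def value_iteration(P, r, gamma, epsilon=1e-6, max_iter=10_000):
        """P: transitions of shape (A, S, S); r: rewards of shape (S, A)."""
        n_actions, n_states, _ = P.shape
        V = np.zeros(n_states)
        threshold = epsilon * (1 - gamma) / (2 * gamma)
        for _ in range(max_iter):
            # Q[s, a] = r(s, a) + gamma * sum_{s'} P(s' | s, a) * V(s')
            Q = r + gamma * np.einsum("ast,t->sa", P, V)
            V_new = Q.max(axis=1)
            if np.max(np.abs(V_new - V)) < threshold:
                V = V_new
                break
            V = V_new
        policy = Q.argmax(axis=1)  # greedy policy w.r.t. the final Q
        return V, policy

    # Tiny illustrative example: 2 states, 2 actions (data assumed).
    P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
                  [[0.5, 0.5], [0.1, 0.9]]])  # action 1
    r = np.array([[1.0, 0.0],
                  [0.0, 2.0]])
    V, policy = value_iteration(P, r, gamma=0.9)
    print(V, policy)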
Rolando Cavazos-Cadena (1989)
Kybernetika
Similarity: