Identification of optimal policies in Markov decision processes

Karel Sladký

Approximations for Markov decision problems

Gerhard Hübner (1983)

Acta Universitatis Carolinae. Mathematica et Physica

A stopping rule for discounted Markov decision processes with finite action sets

Raúl Montes-de-Oca, Enrique Lemus-Rodríguez, Daniel Cruz-Suárez (2009)

Kybernetika

In a discounted Markov decision process (DMDP) with finite action sets, the value iteration algorithm, under suitable conditions, yields an optimal policy in a finite number of steps. Determining an upper bound on the number of steps needed for convergence is of great theoretical and practical interest, since it would provide a computationally feasible stopping rule for value iteration as an algorithm for finding an optimal policy. In this paper we find such a bound...
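To make the idea of a stopping rule for value iteration concrete, here is a minimal Python sketch. It does not reproduce the bound derived in the paper, which the truncated abstract does not state. It uses the standard sup-norm criterion instead: iteration stops once successive value functions differ by less than eps * (1 - gamma) / (2 * gamma), which guarantees only an eps-optimal greedy policy. The function name, the data layout of P and R, the reward-maximizing formulation, and the two-state example are all assumptions made for illustration.

```python
def value_iteration(P, R, gamma, eps=1e-8, max_iter=10_000):
    """Value iteration with the standard sup-norm stopping rule.

    Assumed (hypothetical) layout:
      P[s][a] -- list of (next_state, probability) pairs
      R[s][a] -- immediate reward for taking action a in state s
    Returns (V, policy, iterations_used).
    """
    n = len(P)

    def q_values(V):
        # One-step lookahead value of every action in every state.
        return [[R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a])
                 for a in range(len(P[s]))]
                for s in range(n)]

    V = [0.0] * n
    # Standard threshold for an eps-optimal greedy policy (not the paper's bound).
    threshold = eps * (1.0 - gamma) / (2.0 * gamma)
    for k in range(1, max_iter + 1):
        V_new = [max(q) for q in q_values(V)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < threshold:
            break

    # Greedy policy with respect to the final value estimate.
    Q = q_values(V)
    policy = [max(range(len(Q[s])), key=Q[s].__getitem__) for s in range(n)]
    return V, policy, k


# Hypothetical two-state, two-action example.
P = [
    [[(0, 1.0)], [(1, 1.0)]],  # state 0: action 0 stays, action 1 moves to state 1
    [[(1, 1.0)], [(0, 1.0)]],  # state 1: action 0 stays, action 1 moves to state 0
]
R = [
    [0.0, 1.0],
    [2.0, 0.0],
]
V, policy, steps = value_iteration(P, R, gamma=0.9)
```

In this example, staying in state 1 earns reward 2 at every step, so the greedy policy moves from state 0 to state 1 and then stays there. A bound of the kind the abstract describes would let the loop stop after a number of steps known in advance, rather than by monitoring the difference between successive iterates.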

Displaying similar documents to “Identification of optimal policies in Markov decision processes”

Approximations for Markov decision problems

A stopping rule for discounted Markov decision processes with finite action sets

Weak conditions for the existence of optimal stationary policies in average Markov decision chains with unbounded costs