The search session has expired. Please query the service again.

The search session has expired. Please query the service again.

The search session has expired. Please query the service again.

The search session has expired. Please query the service again.

The search session has expired. Please query the service again.

The search session has expired. Please query the service again.

The search session has expired. Please query the service again.

The search session has expired. Please query the service again.

Displaying similar documents to “Identification of optimal policies in Markov decision processes”

A stopping rule for discounted Markov decision processes with finite action sets

Raúl Montes-de-Oca, Enrique Lemus-Rodríguez, Daniel Cruz-Suárez (2009)

Kybernetika

Similarity:

In a Discounted Markov Decision Process (DMDP) with finite action sets the Value Iteration Algorithm, under suitable conditions, leads to an optimal policy in a finite number of steps. Determining an upper bound on the necessary number of steps till gaining convergence is an issue of great theoretical and practical interest as it would provide a computationally feasible stopping rule for value iteration as an algorithm for finding an optimal policy. In this paper we find such a bound...