In a Discounted Markov Decision Process (DMDP) with finite action sets, the Value Iteration Algorithm, under suitable conditions, leads to an optimal policy in a finite number of steps. Determining an upper bound on the number of steps required for convergence is of great theoretical and practical interest, as it would provide a computationally feasible stopping rule for value iteration as an algorithm for finding an optimal policy. In this paper we find such a bound depending only...
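As a rough illustration of the kind of stopping rule at stake, the sketch below runs value iteration on a hypothetical two-state, two-action DMDP and stops with the classical criterion ||V_{k+1} - V_k||_inf < eps*(1-gamma)/(2*gamma), which guarantees that the greedy policy is eps-optimal; the bound announced in the abstract is not reproduced here, and all numerical data are made up.

    # A minimal sketch, not the paper's bound: value iteration with the classical
    # stopping criterion ||V_{k+1} - V_k||_inf < eps*(1-gamma)/(2*gamma).
    import numpy as np

    gamma, eps = 0.9, 1e-6
    # P[a, s, s']: transition probabilities; R[s, a]: one-step rewards (made up)
    P = np.array([[[0.8, 0.2],
                   [0.1, 0.9]],
                  [[0.5, 0.5],
                   [0.3, 0.7]]])
    R = np.array([[1.0, 0.0],
                  [0.0, 2.0]])

    V, steps = np.zeros(2), 0
    while True:
        V_new = (R + gamma * np.einsum('ast,t->sa', P, V)).max(axis=1)  # Bellman update
        steps += 1
        delta = np.max(np.abs(V_new - V))
        V = V_new
        if delta < eps * (1 - gamma) / (2 * gamma):
            break

    # greedy policy with respect to the final iterate; eps-optimal by the stopping rule
    Q = R + gamma * np.einsum('ast,t->sa', P, V)
    policy = Q.argmax(axis=1)
    print(steps, V, policy)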
This paper deals with a certain class of unbounded optimization problems that depend on a parameter. First, conditions are established which guarantee the continuity, with respect to the parameter, of the minimum of the optimization problems under consideration, and the upper semicontinuity of the multifunction that maps each parameter to its set of minimizers. Moreover, under the additional condition of uniqueness of the minimizer, its...
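As a toy illustration of these statements (the example is ours, not taken from the paper), consider a parametric minimization over the unbounded set of reals, whose value is continuous in the parameter and whose singleton set of minimizers is upper semicontinuous:

    % Toy example, not from the paper: parametric minimization over an unbounded set
    \[
      v(\theta) \;=\; \min_{x \in \mathbb{R}} f(x,\theta),
      \qquad
      f(x,\theta) = (x-\theta)^{2} - \theta^{2},
    \]
    \[
      v(\theta) = -\theta^{2} \quad (\text{continuous in } \theta),
      \qquad
      \operatorname*{arg\,min}_{x \in \mathbb{R}} f(x,\theta) = \{\theta\}
      \quad (\text{a singleton, hence u.s.c.\ in } \theta).
    \]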
Many examples in optimization, ranging from Linear Programming to Markov Decision Processes (MDPs), admit more than one optimal solution. The study of this non-uniqueness is of great mathematical interest. In this paper the authors show that in a specific family of discounted MDPs, non-uniqueness is a “fragile” property: through Ekeland's Principle, for each problem with at least two optimal policies, a perturbed model is produced that has a unique optimal policy. This result not only supersedes previous...
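The following minimal numeric sketch (not the paper's Ekeland-based construction) hints at why non-uniqueness is fragile: in a one-state, two-action discounted model with equal rewards, both deterministic policies are optimal, while an arbitrarily small reward perturbation leaves a unique optimal policy and moves the optimal value by only eps/(1-gamma).

    # A minimal numeric sketch, not the paper's construction: a one-state,
    # two-action DMDP in which both actions pay reward 1, so both deterministic
    # policies are optimal.  Perturbing one reward by any eps > 0 leaves a unique
    # optimal policy and changes the optimal value by eps/(1-gamma).
    gamma = 0.9

    def value(reward, gamma):
        # discounted value of earning `reward` at every step in the single state
        return reward / (1 - gamma)

    rewards = {"a": 1.0, "b": 1.0}            # original model: two optimal policies
    eps = 1e-8
    perturbed = {"a": 1.0 + eps, "b": 1.0}    # perturbed model: "a" uniquely optimal

    best = max(perturbed, key=lambda act: value(perturbed[act], gamma))
    print(best)                                                       # -> a
    print(value(perturbed["a"], gamma) - value(rewards["a"], gamma))  # ~ eps/(1-gamma)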