Weak conditions for the existence of optimal stationary policies in average Markov decision chains with unbounded costs

Rolando Cavazos-Cadena

Displaying similar documents to “Weak conditions for the existence of optimal stationary policies in average Markov decision chains with unbounded costs”

Solution to the optimality equation in a class of Markov decision chains with the average cost criterion

Rolando Cavazos-Cadena (1991)

Kybernetika

A semimartingale characterization of average optimal stationary policies for Markov decision processes

Quanxin Zhu, Xianping Guo (2006)

Journal of Applied Mathematics and Stochastic Analysis

Sample-path average cost optimality for semi-Markov control processes on Borel spaces: unbounded costs and mean holding times

Oscar Vega-Amaya, Fernando Luque-Vásquez (2000)

Applicationes Mathematicae

We deal with semi-Markov control processes (SMCPs) on Borel spaces with unbounded cost and mean holding time. Under suitable growth conditions on the cost function and the mean holding time, together with stability properties of the embedded Markov chains, we show the equivalence of several average cost criteria as well as the existence of stationary optimal policies with respect to each of these criteria.

Sample path average optimality of Markov control processes with strictly unbounded cost

Oscar Vega-Amaya (1999)

Applicationes Mathematicae

We study the existence of sample path average cost (SPAC-) optimal policies for Markov control processes on Borel spaces with strictly unbounded costs, i.e., costs that grow without bound on the complement of compact subsets. Assuming only that the cost function is lower semicontinuous and that the transition law is weakly continuous, we show the existence of a relaxed policy with 'minimal' expected average cost and that the optimal average cost is the limit of discounted programs. Moreover,...

Identification of optimal policies in Markov decision processes

Karel Sladký (2010)

Kybernetika

In this note we focus on identifying optimal policies and on eliminating suboptimal policies when minimizing optimality criteria in discrete-time Markov decision processes with finite state space and compact action set. We present a unified approach to value iteration algorithms that makes it possible to generate lower and upper bounds on optimal values, as well as on the current policy. Using the modified value iterations it is possible to eliminate suboptimal actions and to identify an optimal...