A semimartingale characterization of average optimal stationary policies for Markov decision processes.

Zhu, Quanxin; Guo, Xianping

Displaying similar documents to “A semimartingale characterization of average optimal stationary policies for Markov decision processes.”

Weak conditions for the existence of optimal stationary policies in average Markov decision chains with unbounded costs

Rolando Cavazos-Cadena (1989)

Kybernetika


Deterministic optimal policies for Markov control processes with pathwise constraints

Armando F. Mendoza-Pérez, Onésimo Hernández-Lerma (2012)

Applicationes Mathematicae


This paper deals with discrete-time Markov control processes in Borel spaces with unbounded rewards. Under suitable hypotheses, we show that a randomized stationary policy is optimal for a certain expected constrained problem (ECP) if and only if it is optimal for the corresponding pathwise constrained problem (pathwise CP). Moreover, we show that a certain parametric family of unconstrained optimality equations yields convergence properties that lead to an approximation scheme which...

Average cost Markov control processes with weighted norms: existence of canonical policies

Evgueni Gordienko, Onésimo Hernández-Lerma (1995)

Applicationes Mathematicae


This paper considers discrete-time Markov control processes on Borel spaces, with possibly unbounded costs, and the long-run average cost (AC) criterion. Under appropriate hypotheses on weighted norms for the cost function and the transition law, solutions to the average cost optimality inequality and the average cost optimality equation are shown to exist, which in turn yields the existence of AC-optimal and AC-canonical policies, respectively.

Solution to the optimality equation in a class of Markov decision chains with the average cost criterion

Rolando Cavazos-Cadena (1991)

Kybernetika


Approximation and adaptive control of Markov processes: Average reward criterion

Onésimo Hernández-Lerma (1987)

Kybernetika


Policy iteration for continuous-time average reward Markov decision processes in Polish spaces.

Zhu, Quanxin; Yang, Xinsong; Huang, Chuangxia (2009)

Abstract and Applied Analysis
