Displaying similar documents to “On self-optimizing control of Markov processes”

Semi-Markov control processes with non-compact action spaces and discontinuous costs

Anna Jaśkiewicz (2009)

Applicationes Mathematicae

Similarity:

We establish the average cost optimality equation and show the existence of an (ε-)optimal stationary policy for semi-Markov control processes without compactness and continuity assumptions. The only condition we impose on the model is the V-geometric ergodicity of the embedded Markov chain governed by a stationary policy.

Estimates for perturbations of general discounted Markov control chains

Raúl Montes-de-Oca, Alexander Sakhanenko, Francisco Salem-Silva (2003)

Applicationes Mathematicae

Similarity:

We extend previous results of the same authors ([11]) on the effects of perturbations in the transition probability of a Markov cost chain for discounted Markov control processes. Assuming that Lyapunov- and Harris-type conditions hold for each stationary policy, we obtain upper bounds for the perturbation index, defined as the difference between the total expected discounted costs of the original Markov control process and of the perturbed one. We present examples that satisfy our conditions. ...

Another set of verifiable conditions for average Markov decision processes with Borel spaces

Xiaolong Zou, Xianping Guo (2015)

Kybernetika

Similarity:

In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions, consisting only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the Markov decision process model and are thus easy to verify. We also...

Optimal stopping for Markov Processes

Massimo Lorenzani (1981)

Atti della Accademia Nazionale dei Lincei. Classe di Scienze Fisiche, Matematiche e Naturali. Rendiconti

Similarity:

In this note we present new results on the optimal stopping problem for discrete-time Markov processes.

Average cost Markov control processes with weighted norms: existence of canonical policies

Evgueni Gordienko, Onésimo Hernández-Lerma (1995)

Applicationes Mathematicae

Similarity:

This paper considers discrete-time Markov control processes on Borel spaces, with possibly unbounded costs, under the long-run average cost (AC) criterion. Under appropriate hypotheses on weighted norms for the cost function and the transition law, the existence of solutions to the average cost optimality inequality and the average cost optimality equation is shown; these in turn yield the existence of AC-optimal and AC-canonical policies, respectively.
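For reference, the average cost optimality inequality and equation mentioned in this abstract take the following standard form. The notation here is generic (state space X, admissible actions A(x), one-stage cost c, transition law Q) and is the textbook formulation, not necessarily the paper's exact weighted-norm setting:

```latex
% Average cost optimality inequality (ACOI):
\rho^* + h(x) \;\ge\; \min_{a \in A(x)} \Big[\, c(x,a) + \int_X h(y)\, Q(dy \mid x,a) \Big],
\qquad x \in X.
% Average cost optimality equation (ACOE):
\rho^* + h(x) \;=\; \min_{a \in A(x)} \Big[\, c(x,a) + \int_X h(y)\, Q(dy \mid x,a) \Big],
\qquad x \in X.
```

Here ρ* is the optimal average cost and h a relative value function; a stationary policy attaining the minimum in the ACOE is the kind referred to as canonical.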

On risk sensitive control of regular step Markov processes

Roman Sadowy (2001)

Applicationes Mathematicae

Similarity:

The risk-sensitive control problem for regular step Markov processes is considered, first when the control parameters are changed at shift times and then in the general case.

Estimates of stability of Markov control processes with unbounded costs

Evgueni I. Gordienko, Francisco Salem-Silva (2000)

Kybernetika

Similarity:

For a discrete-time Markov control process with transition probability p, we compare the total discounted costs V_β(π_β) and V_β(π̃_β) obtained when applying the optimal control policy π_β and its approximation π̃_β. The policy π̃_β is optimal for an approximating process with transition probability p̃. The cost per stage for the considered processes can be unbounded. Under certain ergodicity assumptions we establish an upper bound for the relative stability index [V_β(π̃_β) − V_β(π_β)]/V_β(π_β). This bound does not depend...
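A toy numerical illustration of the relative stability index defined above. All model data here are invented for the example, and for simplicity the approximating policy is obtained by perturbing the optimal one directly, rather than by solving an approximating model with transition probability p̃ as in the paper:

```python
import numpy as np

# Hypothetical 2-state, 2-action discounted MDP (illustrative numbers only).
# P[a][x, y]: probability of moving from state x to state y under action a.
# c[x, a]: cost per stage in state x under action a.
beta = 0.9
P = np.array([[[0.8, 0.2], [0.3, 0.7]],   # action 0
              [[0.5, 0.5], [0.9, 0.1]]])  # action 1
c = np.array([[1.0, 2.0],
              [3.0, 0.5]])

def discounted_cost(policy):
    """Total expected beta-discounted cost of a stationary policy.

    Solves the policy-evaluation system V = c_pi + beta * P_pi V,
    i.e. V = (I - beta * P_pi)^{-1} c_pi.
    """
    n = len(policy)
    P_pi = np.array([P[policy[x], x] for x in range(n)])
    c_pi = np.array([c[x, policy[x]] for x in range(n)])
    return np.linalg.solve(np.eye(n) - beta * P_pi, c_pi)

def optimal_policy():
    """Brute-force search over the four stationary deterministic policies."""
    best = None
    for a0 in (0, 1):
        for a1 in (0, 1):
            V = discounted_cost((a0, a1))
            if best is None or V.sum() < best[1].sum():
                best = ((a0, a1), V)
    return best

pi_opt, V_opt = optimal_policy()
pi_approx = (1 - pi_opt[0], pi_opt[1])   # perturb the action chosen at state 0
V_approx = discounted_cost(pi_approx)

# Relative stability index, componentwise; nonnegative since pi_opt is optimal.
index = (V_approx - V_opt) / V_opt
```

Since the optimal policy of a discounted MDP minimizes the cost in every state simultaneously, the index is nonnegative componentwise; the paper's result bounds such an index from above in terms of the distance between p and p̃.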

Estimation and control in finite Markov decision processes with the average reward criterion

Rolando Cavazos-Cadena, Raúl Montes-de-Oca (2004)

Applicationes Mathematicae

Similarity:

This work concerns Markov decision chains with finite state and action sets. The transition law satisfies the simultaneous Doeblin condition but is unknown to the controller, and the problem of determining an optimal adaptive policy with respect to the average reward criterion is addressed. A subset of policies is identified so that, when the system evolves under a policy in that class, the frequency estimators of the transition law are consistent on an essential set of admissible state-action...

Deterministic optimal policies for Markov control processes with pathwise constraints

Armando F. Mendoza-Pérez, Onésimo Hernández-Lerma (2012)

Applicationes Mathematicae

Similarity:

This paper deals with discrete-time Markov control processes in Borel spaces with unbounded rewards. Under suitable hypotheses, we show that a randomized stationary policy is optimal for a certain expected constrained problem (ECP) if and only if it is optimal for the corresponding pathwise constrained problem (pathwise CP). Moreover, we show that a certain parametric family of unconstrained optimality equations yields convergence properties that lead to an approximation scheme which...