Displaying similar documents to “Solutions of semi-Markov control models with recursive discount rates and approximation by ϵ-optimal policies”

Discrete-time Markov control processes with recursive discount rates

Yofre H. García, Juan González-Hernández (2016)

Kybernetika

Similarity:

This work analyzes a discrete-time Markov Control Model (MCM) on Borel spaces when the performance index is the expected total discounted cost. This criterion admits unbounded costs. It is assumed that the discount rate in any period is obtained by using recursive functions and a known initial discount rate. The classic dynamic programming method for the finite-horizon case is verified. Under mild conditions, the existence of deterministic non-stationary optimal policies for the infinite-horizon...
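The finite-horizon dynamic programming scheme the abstract refers to can be sketched in a few lines. Everything concrete below is illustrative, not taken from the paper: the recursion h for the discount rate, the toy state/action sets, the cost, and the dynamics are all invented to show the shape of backward induction when the discount rate itself evolves recursively from a known initial value.

```python
def h(alpha):
    """Illustrative recursion for the discount rate (fixed point at 0.9)."""
    return 0.5 * alpha + 0.45

def finite_horizon_dp(states, actions, cost, step, alpha0, T):
    """Backward induction with V_T = 0 and
    V_t(x) = min_a [ cost(x,a) + alpha_t * V_{t+1}(step(x,a)) ],
    where alpha_{t+1} = h(alpha_t) and alpha_0 is the known initial rate."""
    # Pre-compute the discount rate used in each period.
    alphas = [alpha0]
    for _ in range(T - 1):
        alphas.append(h(alphas[-1]))
    V = {x: 0.0 for x in states}  # terminal values
    policy = []
    for t in reversed(range(T)):
        Vt, pt = {}, {}
        for x in states:
            value, action = min(
                (cost(x, a) + alphas[t] * V[step(x, a)], a) for a in actions
            )
            Vt[x], pt[x] = value, action
        V, policy = Vt, [pt] + policy
    return V, policy

# Toy instance: states 0..4, actions move left/stay/right, quadratic state cost.
states = range(5)
actions = (-1, 0, 1)
cost = lambda x, a: x * x + abs(a)
step = lambda x, a: min(max(x + a, 0), 4)
V, policy = finite_horizon_dp(states, actions, cost, step, alpha0=0.9, T=3)
```

The policy returned is deterministic but non-stationary (one decision rule per period), matching the kind of policies the abstract mentions for the infinite-horizon limit.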

The exponential cost optimality for finite horizon semi-Markov decision processes

Haifeng Huo, Xian Wen (2022)

Kybernetika

Similarity:

This paper considers an exponential cost optimality problem for finite horizon semi-Markov decision processes (SMDPs). The objective is to calculate an optimal policy with minimal exponential costs over the full set of policies in a finite horizon. First, under the standard regularity and compactness-continuity conditions, we establish the optimality equation, prove that the value function is the unique solution of the optimality equation, and establish the existence of an optimal policy by using the...

Estimates of stability of Markov control processes with unbounded costs

Evgueni I. Gordienko, Francisco Salem-Silva (2000)

Kybernetika

Similarity:

For a discrete-time Markov control process with the transition probability p, we compare the total discounted costs V_β(π_β) and V_β(π̃_β) obtained when applying the optimal control policy π_β and its approximation π̃_β. The policy π̃_β is optimal for an approximating process with the transition probability p̃. The cost per stage for the processes considered can be unbounded. Under certain ergodicity assumptions we establish an upper bound for the relative stability index [V_β(π̃_β) - V_β(π_β)]/V_β(π_β). This bound does not depend...
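The relative stability index in the abstract can be computed directly once the discounted cost of each fixed policy is known. The toy two-state chain, costs, and policies below are made up for illustration; only the quantity [V_β(π̃_β) - V_β(π_β)]/V_β(π_β) comes from the abstract.

```python
def discounted_cost(P, c, beta, tol=1e-12):
    """Value of a fixed policy: solve V = c + beta * P V by iteration."""
    n = len(c)
    V = [0.0] * n
    while True:
        Vn = [c[i] + beta * sum(P[i][j] * V[j] for j in range(n))
              for i in range(n)]
        if max(abs(Vn[i] - V[i]) for i in range(n)) < tol:
            return Vn
        V = Vn

beta = 0.9
# Transition matrix and per-stage cost induced by the optimal policy pi_beta.
P_opt, c_opt = [[0.8, 0.2], [0.3, 0.7]], [1.0, 2.0]
# The same quantities under the approximating policy pi_tilde_beta.
P_apx, c_apx = [[0.7, 0.3], [0.3, 0.7]], [1.1, 2.0]

V_opt = discounted_cost(P_opt, c_opt, beta)
V_apx = discounted_cost(P_apx, c_apx, beta)
index = (V_apx[0] - V_opt[0]) / V_opt[0]  # relative stability index at state 0
```

The index is nonnegative here because the approximating policy is (weakly) worse at every state; the paper's contribution is an upper bound on this quantity in terms of the distance between p and p̃.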

Limiting average cost control problems in a class of discrete-time stochastic systems

Nadine Hilgert, Onesimo Hernández-Lerma (2001)

Applicationes Mathematicae

Similarity:

We consider a class of ℝ^d-valued stochastic control systems, with possibly unbounded costs. The systems evolve according to the discrete-time equation xₜ₊₁ = Gₙ(xₜ, aₜ) + ξₜ (t = 0,1,...), for each fixed n = 0,1,..., where the ξₜ are i.i.d. random vectors and the Gₙ are given functions converging pointwise to some function G as n → ∞. Under suitable hypotheses, our main results state the existence of stationary control policies that are expected average cost (EAC) optimal and sample path average cost (SPAC)...
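A minimal simulation makes the system class concrete: iterate xₜ₊₁ = Gₙ(xₜ, aₜ) + ξₜ under a stationary policy and record the sample path average cost. The particular Gₙ sequence, policy, noise distribution, and cost below are invented for illustration only.

```python
import random

def G_n(n, x, a):
    # Example sequence converging pointwise to G(x, a) = 0.5*x + a as n grows.
    return (0.5 + 1.0 / (n + 2)) * x + a

def sample_path_average_cost(n, policy, cost, T, seed=0):
    """Simulate x_{t+1} = G_n(x_t, a_t) + xi_t and average the running cost."""
    rng = random.Random(seed)
    x, total = 0.0, 0.0
    for _ in range(T):
        a = policy(x)
        total += cost(x, a)
        xi = rng.gauss(0.0, 0.1)  # i.i.d. noise vectors (here scalar)
        x = G_n(n, x, a) + xi
    return total / T

avg = sample_path_average_cost(
    n=10,
    policy=lambda x: -0.2 * x,      # a stationary policy
    cost=lambda x, a: x * x + a * a,
    T=5000,
)
```

With this contractive choice of dynamics the long-run average stays bounded, which is the kind of behavior the paper's EAC/SPAC optimality results concern.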

Estimates for perturbations of average Markov decision processes with a minimal state and upper bounded by stochastically ordered Markov chains

Raúl Montes-de-Oca, Francisco Salem-Silva (2005)

Kybernetika

Similarity:

This paper deals with Markov decision processes (MDPs) with real state space for which the minimum is attained, and that are upper bounded by (uncontrolled) stochastically ordered (SO) Markov chains. We consider MDPs with (possibly) unbounded costs, and to evaluate the quality of each policy we use the objective function known as the average cost. For this objective function we consider two Markov control models ℳ and ℳ₁; ℳ and ℳ₁ have the same components except for the transition laws....

Semi-Markov control processes with non-compact action spaces and discontinuous costs

Anna Jaśkiewicz (2009)

Applicationes Mathematicae

Similarity:

We establish the average cost optimality equation and show the existence of an (ε-)optimal stationary policy for semi-Markov control processes without compactness and continuity assumptions. The only condition we impose on the model is the V-geometric ergodicity of the embedded Markov chain governed by a stationary policy.

A two-disorder detection problem

Krzysztof Szajowski (1997)

Applicationes Mathematicae

Similarity:

Suppose that the process X = {Xₙ, n ∈ ℕ} is observed sequentially. There are two random moments of time θ₁ and θ₂, independent of X, and X is a Markov process given θ₁ and θ₂. The transition probabilities of X change for the first time at time θ₁ and for the second time at time θ₂. Our objective is to find a strategy which immediately detects the distribution changes with maximal probability, based on observation of X. The corresponding problem of double optimal stopping is constructed. The optimal strategy...

An optimal strong equilibrium solution for cooperative multi-leader-follower Stackelberg Markov chains games

Kristal K. Trejo, Julio B. Clempner, Alexander S. Poznyak (2016)

Kybernetika

Similarity:

This paper presents a novel approach for computing the strong Stackelberg/Nash equilibrium for Markov chain games. For solving the cooperative n-leader and m-follower Markov game we consider the minimization of the L_p-norm, which reduces the distance to the utopian point in Euclidean space. Then we reduce the optimization problem to finding a Pareto-optimal solution. We employ a bi-level programming method, implemented by the extraproximal optimization approach, for computing the strong...
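The selection rule described in the abstract, picking the point closest to the utopian point in L_p distance, can be illustrated on a toy set of cost vectors. The candidate set below is invented; only the distance-to-utopian-point criterion comes from the abstract.

```python
def lp_distance(u, v, p=2):
    """L_p distance between two vectors in Euclidean space."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

# Hypothetical candidate cost vectors (e.g., Pareto-optimal outcomes).
candidates = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.5)]
# Utopian point: the componentwise minimum, usually unattainable jointly.
utopian = tuple(min(c[i] for c in candidates) for i in range(2))
# Select the candidate minimizing the L_p distance to the utopian point.
best = min(candidates, key=lambda c: lp_distance(c, utopian))
```

The balanced vector (2.0, 2.0) wins under p = 2 even though each extreme vector is best in one coordinate, which is exactly why the utopian-point criterion is used to pick a single compromise among Pareto-optimal outcomes.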

Semi-Markov control models with average costs

Fernando Luque-Vásquez, Onésimo Hernández-Lerma (1999)

Applicationes Mathematicae

Similarity:

This paper studies semi-Markov control models with Borel state and control spaces, and unbounded cost functions, under the average cost criterion. Conditions are given for (i) the existence of a solution to the average cost optimality equation, and for (ii) the existence of strong optimal control policies. These conditions are illustrated with a semi-Markov replacement model.