Displaying similar documents to “The exponential cost optimality for finite horizon semi-Markov decision processes”

Solutions of semi-Markov control models with recursive discount rates and approximation by ϵ -optimal policies

Yofre H. García, Juan González-Hernández (2019)

Kybernetika

Similarity:

This paper studies a class of discrete-time discounted semi-Markov control models on Borel spaces. We assume possibly unbounded costs and a non-stationary exponential form of the discount factor, which depends on a rate, called the discount rate. Given an initial discount rate, the evolution in subsequent steps depends on both the previous discount rate and the sojourn time of the system at the current state. The new results provided here are the existence and the approximation of optimal...
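As a purely illustrative sketch of such a model (the two-state chain, the cost rates, the sojourn-time distributions, and the recursive update of the discount rate below are all hypothetical choices, not taken from the paper), the total cost along one trajectory, discounted exponentially at a rate that evolves with the sojourn times, can be simulated as:

```python
import numpy as np

def simulate_discounted_cost(n_steps, alpha0, rng):
    """One sample path of the total discounted cost of a toy two-state
    semi-Markov chain under a fixed policy.  The discount rate alpha
    evolves recursively with the sojourn time tau; the update rule is
    purely illustrative."""
    total, discount, alpha = 0.0, 1.0, alpha0
    state = 0
    for _ in range(n_steps):
        tau = rng.exponential(1.0 + 0.5 * state)    # sojourn time in the current state
        cost = 1.0 + state                          # hypothetical one-step cost
        total += discount * cost
        discount *= np.exp(-alpha * tau)            # exponential discounting over the sojourn
        alpha = 0.5 * alpha + 0.1 * min(tau, 1.0)   # hypothetical recursive discount-rate update
        state = 1 - state                           # deterministic two-state alternation
    return total

rng = np.random.default_rng(0)
est = np.mean([simulate_discounted_cost(200, 0.2, rng) for _ in range(2000)])
```

The point of the sketch is only the mechanism: the factor applied at step k depends on the whole history of previous rates and sojourn times, so the discounting is non-stationary.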

Uniqueness of optimal policies as a generic property of discounted Markov decision processes: Ekeland's variational principle approach

R. Israel Ortega-Gutiérrez, Raúl Montes-de-Oca, Enrique Lemus-Rodríguez (2016)

Kybernetika

Similarity:

Many examples in optimization, ranging from Linear Programming to Markov Decision Processes (MDPs), present more than one optimal solution. The study of this non-uniqueness is of great mathematical interest. In this paper the authors show that in a specific family of discounted MDPs, non-uniqueness is a “fragile” property: through Ekeland's Principle, for each problem with at least two optimal policies, a perturbed model is produced with a unique optimal policy. This result not only supersedes...

Mean-variance optimality for semi-Markov decision processes under first passage criteria

Xiangxiang Huang, Yonghui Huang (2017)

Kybernetika

Similarity:

This paper deals with a first passage mean-variance problem for semi-Markov decision processes in Borel spaces. The goal is to minimize the variance of a total discounted reward up to the system's first entry to some target set, where the optimization is over a class of policies with a prescribed expected first passage reward. The reward rates are assumed to be possibly unbounded, while the discount factor may vary with states of the system and controls. We first develop some suitable...

A two-disorder detection problem

Krzysztof Szajowski (1997)

Applicationes Mathematicae

Similarity:

Suppose that the process X = {Xₙ} is observed sequentially. There are two random moments of time θ₁ and θ₂, independent of X, and X is a Markov process given θ₁ and θ₂. The transition probabilities of X change for the first time at time θ₁ and for the second time at time θ₂. Our objective is to find a strategy which immediately detects the distribution changes with maximal probability, based on observation of X. The corresponding problem of double optimal stopping is constructed. The optimal strategy...
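For intuition, the single-disorder special case of this setting is the classical Bayesian quickest-detection problem, where the posterior probability that the change has already occurred admits a well-known recursion (Shiryaev's rule). The sketch below is that one-change simplification, not the paper's two-disorder strategy; the geometric prior parameter, the Gaussian pre/post-change densities, and the threshold are illustrative assumptions:

```python
import numpy as np
from math import exp

def shiryaev_detect(xs, p=0.05, threshold=0.95):
    """Sequentially update the posterior probability that the change
    point theta has already occurred, stopping the first time it
    exceeds `threshold`.  Pre-change observations are N(0,1) and
    post-change observations N(1,1) (illustrative choices).
    Returns (stopping index or None, last posterior value)."""
    def f0(x):  # pre-change density N(0,1), up to the common normalizing constant
        return exp(-0.5 * x * x)
    def f1(x):  # post-change density N(1,1)
        return exp(-0.5 * (x - 1.0) ** 2)
    pi = 0.0
    for n, x in enumerate(xs):
        prior = pi + (1.0 - pi) * p      # prob. the change occurred by this step
        num = prior * f1(x)
        den = num + (1.0 - prior) * f0(x)
        pi = num / den                   # Bayes update given the new observation
        if pi >= threshold:
            return n, pi
    return None, pi

rng = np.random.default_rng(1)
theta = 50  # true change point, known only to the simulator
xs = np.concatenate([rng.normal(0.0, 1.0, theta), rng.normal(1.0, 1.0, 150)])
n_stop, pi_stop = shiryaev_detect(xs)
```

The two-disorder problem of the paper stacks two such detection tasks, which is what leads to the double optimal stopping formulation.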

Identification of optimal policies in Markov decision processes

Karel Sladký (2010)

Kybernetika

Similarity:

In this note we focus attention on identifying optimal policies and on eliminating suboptimal policies when minimizing optimality criteria in discrete-time Markov decision processes with finite state space and compact action set. We present a unified approach to value iteration algorithms that enables us to generate lower and upper bounds on optimal values, as well as on the current policy. Using the modified value iterations it is possible to eliminate suboptimal actions and to identify an optimal...
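The truncated abstract hides the details, but the general idea — value iteration producing two-sided bounds on the optimal value, which in turn let provably suboptimal actions be discarded — can be sketched with the classical MacQueen-type bounds for a finite discounted cost-minimization MDP. This is a minimal illustration under standard assumptions, not the note's actual algorithm, and the model data at the bottom are made up:

```python
import numpy as np

def vi_with_elimination(P, c, beta, tol=1e-9, max_iter=10_000):
    """Value iteration for a finite discounted cost-minimization MDP,
    with MacQueen-type lower/upper bounds on the optimal value v* and
    a conservative test that permanently discards provably suboptimal
    actions.  P[a] is the |S|x|S| transition matrix of action a,
    c[s, a] the one-step cost, 0 < beta < 1 the discount factor."""
    n_states, n_actions = c.shape
    v = np.zeros(n_states)
    active = [set(range(n_actions)) for _ in range(n_states)]
    for _ in range(max_iter):
        # Q-values, restricted to actions not yet eliminated
        q = np.full((n_states, n_actions), np.inf)
        for a in range(n_actions):
            qa = c[:, a] + beta * P[a] @ v
            for s in range(n_states):
                if a in active[s]:
                    q[s, a] = qa[s]
        v_new = q.min(axis=1)
        delta = v_new - v
        lo = beta / (1.0 - beta) * delta.min()  # v* >= v_new + lo
        hi = beta / (1.0 - beta) * delta.max()  # v* <= v_new + hi
        # action a is suboptimal at s if even its optimistic Q-value
        # exceeds the pessimistic bound on the optimal value at s
        for s in range(n_states):
            for a in list(active[s]):
                if q[s, a] + lo > v_new[s] + hi and len(active[s]) > 1:
                    active[s].discard(a)
        v = v_new
        if hi - lo < tol:
            break
    return v + 0.5 * (lo + hi), active

# hypothetical 2-state, 2-action model
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.7, 0.3]]])
c = np.array([[1.0, 2.0],
              [0.5, 3.0]])
v, active = vi_with_elimination(P, c, beta=0.9)
```

Because the elimination test only fires when the bounds certify suboptimality, every state's active set always retains at least one optimal action, and the surviving actions identify an optimal policy once the bounds are tight enough.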

Deterministic optimal policies for Markov control processes with pathwise constraints

Armando F. Mendoza-Pérez, Onésimo Hernández-Lerma (2012)

Applicationes Mathematicae

Similarity:

This paper deals with discrete-time Markov control processes in Borel spaces with unbounded rewards. Under suitable hypotheses, we show that a randomized stationary policy is optimal for a certain expected constrained problem (ECP) if and only if it is optimal for the corresponding pathwise constrained problem (pathwise CP). Moreover, we show that a certain parametric family of unconstrained optimality equations yields convergence properties that lead to an approximation scheme which...