Bounds on discrete dynamic programming recursions. II. Polynomial bounds on problems with block-triangular structure
In this note we focus attention on characterizations of policies maximizing the growth rate of expected utility, along with the average of the associated certainty equivalent, in risk-sensitive Markov decision chains with finite state and action spaces. In contrast to the existing literature, the problem is handled by methods of stochastic dynamic programming, under the condition that the transition probabilities are replaced by general nonnegative matrices. Using the block-triangular decomposition of a collection...
In this note we focus attention on identifying optimal policies and on eliminating suboptimal policies when minimizing optimality criteria in discrete-time Markov decision processes with finite state space and compact action set. We present a unified approach to value iteration algorithms that makes it possible to generate lower and upper bounds on the optimal values, as well as on the value of the current policy. Using the modified value iterations, it is possible to eliminate suboptimal actions and to identify an optimal policy...
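As a rough illustration of how value iteration can produce bounds and discard actions, the following sketch assumes a finite action set and a discounted total-cost criterion with discount factor `beta`; neither assumption is stated in the abstract above. It uses the classical MacQueen-type bounds and elimination test rather than the paper's own modified iterations, and the function name, argument layout, and tolerances are hypothetical.

```python
def value_iteration_with_bounds(costs, trans, beta, tol=1e-8, max_iter=10_000):
    """Discounted cost-minimizing value iteration with MacQueen-type bounds.

    costs[i][a]  : one-stage cost of action a in state i (assumed layout)
    trans[i][a]  : list of transition probabilities from state i under action a
    beta         : discount factor, 0 < beta < 1 (an assumption of this sketch)
    Returns (lower, upper, active, policy).
    """
    n = len(costs)
    # Actions not yet proven suboptimal, per state.
    active = [set(range(len(costs[i]))) for i in range(n)]
    v = [0.0] * n
    scale = beta / (1.0 - beta)

    def q(i, a, vec):
        # One-step lookahead value of action a in state i against vector vec.
        return costs[i][a] + beta * sum(p * x for p, x in zip(trans[i][a], vec))

    lower, upper = v[:], v[:]
    for _ in range(max_iter):
        new_v = [min(q(i, a, v) for a in active[i]) for i in range(n)]
        diffs = [nv - ov for nv, ov in zip(new_v, v)]
        # Bounds on the optimal value: new_v + beta/(1-beta) * (min or max diff).
        lower = [x + scale * min(diffs) for x in new_v]
        upper = [x + scale * max(diffs) for x in new_v]
        # An action is suboptimal if even its optimistic value (computed
        # against the lower bound) exceeds the upper bound for its state.
        for i in range(n):
            active[i] = {a for a in active[i] if q(i, a, lower) <= upper[i] + 1e-12}
        v = new_v
        if max(diffs) - min(diffs) < tol:
            break

    policy = [min(active[i], key=lambda a: q(i, a, v)) for i in range(n)]
    return lower, upper, active, policy
```

On a small two-state example, the bounds close quickly and the expensive actions are removed after the first sweep:

```python
costs = [[1.0, 5.0], [2.0, 10.0]]
trans = [[[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]]
lower, upper, active, policy = value_iteration_with_bounds(costs, trans, beta=0.5)
```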
In this note attention is focused on finding policies optimizing risk-sensitive optimality criteria in Markov decision chains. To this end we assume that the total reward generated by the Markov process is evaluated by an exponential utility function with a given risk-sensitive coefficient. The ratio of the first two moments depends on the value of the risk-sensitive coefficient; if the risk-sensitive coefficient is equal to zero, we speak of risk-neutral models. Observe that the first moment of...
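To make the role of the risk-sensitive coefficient concrete, the sketch below computes the certainty equivalent of a discrete random reward under exponential utility, using the standard formula CE = (1/gamma) ln E[exp(gamma X)] for gamma different from zero and the plain expectation for gamma equal to zero. The function name and the representation of the reward as lists of outcomes and probabilities are assumptions made for illustration only, and the sign convention for gamma may differ from the one used in the paper.

```python
import math


def certainty_equivalent(outcomes, probs, gamma):
    """Certainty equivalent of a discrete reward under exponential utility.

    outcomes : possible values of the total reward (hypothetical input format)
    probs    : matching probabilities, assumed to sum to one
    gamma    : risk-sensitive coefficient; gamma == 0 is the risk-neutral case
    """
    if abs(gamma) < 1e-12:
        # Risk-neutral model: the certainty equivalent is the expected reward.
        return sum(p * x for x, p in zip(outcomes, probs))
    # Shift by the largest exponent before exponentiating to avoid overflow.
    shift = max(gamma * x for x in outcomes)
    total = sum(p * math.exp(gamma * x - shift) for x, p in zip(outcomes, probs))
    return (shift + math.log(total)) / gamma
```

For a reward of 0 or 10 with equal probability, a negative coefficient gives a certainty equivalent below the mean of 5 and a positive coefficient gives one above it; for small `gamma` the value is close to the mean plus `gamma / 2` times the variance.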
The article is devoted to Markov reward chains in a discrete-time setting with finite state spaces. Unfortunately, the usual optimization criteria examined in the literature on Markov decision chains, such as total discounted reward, total reward up to reaching some specific state (the so-called first passage models), or mean (average) reward optimality, may be quite insufficient to characterize the problem from the point of view of a decision maker. To this end it seems that it may be preferable, if not necessary...
Part II, which follows Part I on the discrete-time case (see [DijkSl1]), deals with monotonicity and comparison results, as a generalization of the pure stochastic case, for stochastic dynamic systems with arbitrary nonnegative generators in the continuous-time case. In contrast with the discrete-time case, the generalization is no longer straightforward. A discrete-time transformation will therefore be developed first. Next, results from Part I can be adopted. The conditions,...
In two consecutive parts, Part I and Part II, monotonicity and comparison results will be studied, as a generalization of the pure stochastic case, for arbitrary dynamic systems governed by nonnegative matrices. Part I covers the discrete-time case and Part II the continuous-time case. The research was initially motivated by a reliability application contained in Part II. In the present Part I it is shown that monotonicity and comparison results, as known for Markov chains, do carry over rather smoothly...