
Growth rates and average optimality in risk-sensitive Markov decision chains

Karel Sladký — 2008

Kybernetika

In this note we focus attention on characterizations of policies maximizing the growth rate of expected utility, along with the average of the associated certainty equivalent, in risk-sensitive Markov decision chains with finite state and action spaces. In contrast to the existing literature, the problem is handled by methods of stochastic dynamic programming under the condition that the transition probabilities are replaced by general nonnegative matrices. Using the block-triangular decomposition of a collection...
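
The growth-rate notion in the abstract above can be illustrated with a minimal sketch. The choice of plain power iteration on a nonnegative matrix is my own illustrative assumption, not the paper's block-triangular method: for a nonnegative matrix M, the long-run growth rate of the iterates x_{n+1} = M x_n is the spectral radius of M.

```python
def growth_rate(M, iters=500):
    """Estimate the spectral radius of a nonnegative square matrix M
    (given as a list of lists) by power iteration."""
    n = len(M)
    x = [1.0] * n
    rate = 0.0
    for _ in range(iters):
        y = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        rate = max(y)              # sup-norm of the current iterate
        x = [v / rate for v in y]  # renormalize to avoid overflow
    return rate

# Example: this 2x2 nonnegative matrix has spectral radius 2.
print(growth_rate([[1.0, 1.0], [1.0, 1.0]]))  # ≈ 2.0
```

Note that this works for substochastic or superstochastic matrices alike, which is the point of replacing transition probabilities by general nonnegative matrices.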

Identification of optimal policies in Markov decision processes

Karel Sladký — 2010

Kybernetika

In this note we focus attention on identifying optimal policies and on eliminating suboptimal policies when minimizing optimality criteria in discrete-time Markov decision processes with finite state space and compact action set. We present a unified approach to value iteration algorithms that makes it possible to generate lower and upper bounds on the optimal values, as well as on the current policy. Using the modified value iterations it is possible to eliminate suboptimal actions and to identify an optimal policy...

Risk-sensitive average optimality in Markov decision processes

Karel Sladký — 2018

Kybernetika

In this note attention is focused on finding policies optimizing risk-sensitive optimality criteria in Markov decision chains. To this end we assume that the total reward generated by the Markov process is evaluated by an exponential utility function with a given risk-sensitive coefficient. The ratio of the first two moments depends on the value of the risk-sensitive coefficient; if the risk-sensitive coefficient is equal to zero we speak of risk-neutral models. Observe that the first moment of...
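
The role of the risk-sensitive coefficient can be made concrete with a small sketch. The notation below is my own, not the paper's: under exponential utility with coefficient lam, the certainty equivalent of a random reward X is (1/lam) * log E[exp(lam * X)], and as lam tends to 0 it recovers the risk-neutral mean E[X].

```python
import math
import random

def certainty_equivalent(samples, lam):
    """Empirical certainty equivalent under exponential utility."""
    if lam == 0.0:                       # risk-neutral limit: the plain mean
        return sum(samples) / len(samples)
    m = max(lam * x for x in samples)    # log-sum-exp shift for stability
    log_mean = m + math.log(
        sum(math.exp(lam * x - m) for x in samples) / len(samples))
    return log_mean / lam

random.seed(0)
xs = [random.gauss(10.0, 2.0) for _ in range(50000)]
print(certainty_equivalent(xs, 0.0))    # close to the mean, 10
print(certainty_equivalent(xs, -0.5))   # risk-averse: below the mean
```

For a Gaussian reward the certainty equivalent is approximately mean + lam * variance / 2, which shows directly how the first two moments enter through the coefficient.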

Second order optimality in Markov decision chains

Karel Sladký — 2017

Kybernetika

The article is devoted to Markov reward chains in a discrete-time setting with finite state spaces. Unfortunately, the usual optimization criteria examined in the literature on Markov decision chains, such as total discounted reward, total reward up to reaching some specific state (the so-called first passage models), or mean (average) reward optimality, may be quite insufficient to characterize the problem from the point of view of a decision maker. To this end it seems that it may be preferable, if not necessary...

Monotonicity and comparison results for nonnegative dynamic systems. Part II: Continuous-time case

Nico M. van Dijk, Karel Sladký — 2006

Kybernetika

This Part II, which follows Part I for the discrete-time case (see [DijkSl1]), deals with monotonicity and comparison results, as a generalization of the pure stochastic case, for stochastic dynamic systems with arbitrary nonnegative generators in continuous time. In contrast with the discrete-time case, the generalization is no longer straightforward. A discrete-time transformation will therefore be developed first. Next, results from Part I can be adopted. The conditions,...
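
One standard continuous-to-discrete transformation in this setting is uniformization; this is an assumption on my part, since the abstract does not name the paper's transformation. For a Markov generator Q (nonnegative off-diagonal entries, rows summing to zero), P = I + Q / Lam is a stochastic matrix for any Lam at least as large as the largest exit rate.

```python
def uniformize(Q, Lam):
    """Map a generator matrix Q to the stochastic matrix I + Q / Lam.
    Requires Lam >= max_i |Q[i][i]| so all entries stay nonnegative."""
    n = len(Q)
    return [[(1.0 if i == j else 0.0) + Q[i][j] / Lam for j in range(n)]
            for i in range(n)]

Q = [[-2.0, 2.0],
     [ 1.0, -1.0]]
P = uniformize(Q, 2.0)   # each row of P sums to 1
```

Once the system is recast as a discrete-time chain, discrete-time monotonicity and comparison results can be applied directly, which matches the strategy the abstract describes.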

Monotonicity and comparison results for nonnegative dynamic systems. Part I: Discrete-time case

Nico M. van Dijk, Karel Sladký — 2006

Kybernetika

In two subsequent parts, Part I and Part II, monotonicity and comparison results will be studied, as a generalization of the pure stochastic case, for arbitrary dynamic systems governed by nonnegative matrices. Part I covers the discrete-time case and Part II the continuous-time case. The research was initially motivated by a reliability application contained in Part II. In the present Part I it is shown that monotonicity and comparison results, as known for Markov chains, carry over rather smoothly...
