
The risk-sensitive Poisson equation for a communicating Markov chain on a denumerable state space

Rolando Cavazos-Cadena — 2009

Kybernetika

This work concerns a discrete-time Markov chain with time-invariant transition mechanism and denumerable state space, which is endowed with a nonnegative cost function with finite support. The performance of the chain is measured by the (long-run) risk-sensitive average cost and, assuming that the state space is communicating, the existence of a solution to the risk-sensitive Poisson equation is established, a result that holds even for transient chains. Also, a sufficient criterion ensuring that...
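For reference, the multiplicative (risk-sensitive) Poisson equation mentioned above is commonly written in the following generic form; this is a sketch of the standard formulation, not necessarily the paper's exact notation:

```latex
e^{\lambda\,[g + h(x)]} \;=\; e^{\lambda\,C(x)} \sum_{y \in S} p_{xy}\, e^{\lambda\, h(y)},
\qquad x \in S,
```

where $\lambda \neq 0$ is the risk-sensitivity coefficient, $C$ is the cost function, $g$ is the risk-sensitive average cost, and $h$ is the relative value function; establishing the existence of a pair $(g, h)$ solving this equation on a denumerable state space $S$ is the problem the abstract refers to.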

Generalized communication conditions and the eigenvalue problem for a monotone and homogenous function

Rolando Cavazos-Cadena — 2010

Kybernetika

This work is concerned with the eigenvalue problem for a monotone and homogeneous self-mapping f of a finite dimensional positive cone. Paralleling the classical analysis of the (linear) Perron–Frobenius theorem, a verifiable communication condition is formulated in terms of the successive compositions of f, and under such a condition it is shown that the upper eigenspaces of f are bounded in the projective sense, a property that yields the existence of a nonlinear eigenvalue as well as the projective...
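The power-type iteration behind such nonlinear Perron–Frobenius results can be illustrated on the linear special case, where the eigenvalue is the spectral radius of a nonnegative matrix. The map f, the matrix A, and the iteration count below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Linear special case of a monotone, homogeneous self-map of the
# positive cone: f(x) = A @ x for a nonnegative (here primitive) matrix A.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

def f(x):
    return A @ x

# Power iteration with projective (sup-norm) normalization: when a
# communication condition holds, the scaling factors converge to the
# eigenvalue and x converges to a positive eigenvector.
x = np.ones(2)
for _ in range(200):
    y = f(x)
    lam = np.max(y)   # normalizing factor
    x = y / lam

print(round(lam, 4))  # eigenvalue is (5 + sqrt(5)) / 2, approximately 3.618
```

For a genuinely nonlinear monotone homogeneous f (e.g. one built from min/max combinations of linear maps), the same iteration applies verbatim; only the definition of `f` changes.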

An optimality system for finite average Markov decision chains under risk-aversion

This work concerns controlled Markov chains with finite state space and compact action sets. The decision maker is risk-averse with constant risk-sensitivity, and the performance of a control policy is measured by the long-run average cost criterion. Under standard continuity-compactness conditions, it is shown that the (possibly non-constant) optimal value function is characterized by a system of optimality equations from which an optimal stationary policy can be obtained. Also, it is shown that the...
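As a point of reference, in the communicating case the risk-averse average-cost optimality equation typically takes the following multiplicative dynamic-programming form; this is a generic sketch, and the paper's system accommodates a possibly non-constant optimal value g(·):

```latex
e^{\lambda\,[g + h(x)]} \;=\;
\min_{a \in A(x)} \Bigl\{\, e^{\lambda\, C(x,a)} \sum_{y \in S} p(y \mid x, a)\, e^{\lambda\, h(y)} \Bigr\},
\qquad x \in S,
```

where $\lambda > 0$ is the (constant) risk-sensitivity coefficient; a stationary policy attaining the minimum at every state is a natural candidate for optimality.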

Nash Equilibria in a class of Markov stopping games

This work concerns a class of discrete-time, zero-sum games with two players and Markov transitions on a denumerable space. At each decision time player II can stop the system paying a terminal reward to player I and, if the system is not halted, player I selects an action to drive the system and receives a running reward from player II. Measuring the performance of a pair of decision strategies by the total expected discounted reward, under standard continuity-compactness conditions it is shown...
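The value of such a discounted stopping game can be computed by a contraction-type value iteration: at each state, player II compares the terminal reward with player I's best continuation value. The three-state model below (states, rewards, transition law, discount factor) is a hypothetical illustration, not data from the paper:

```python
import numpy as np

beta = 0.9                                  # discount factor
T = np.array([5.0, 2.0, 8.0])               # terminal reward paid by player II on stopping
R = np.array([[1.0, 0.5],                   # running reward r(x, a) paid to player I
              [0.2, 1.5],
              [0.8, 0.3]])
P = np.array([[[0.6, 0.3, 0.1], [0.1, 0.8, 0.1]],   # transition law p(y | x, a)
              [[0.3, 0.4, 0.3], [0.5, 0.2, 0.3]],
              [[0.2, 0.2, 0.6], [0.4, 0.4, 0.2]]])

# Value iteration: player I maximizes over actions if the game continues,
# player II stops whenever the terminal payment is cheaper than continuing.
V = np.zeros(3)
for _ in range(500):
    cont = (R + beta * (P @ V)).max(axis=1)  # player I's best continuation value
    V = np.minimum(T, cont)                  # player II's stop-or-continue choice

print(np.round(V, 3))
```

Since the operator is a beta-contraction in the sup norm, the iterates converge to the unique value of the game, and the states where `V` equals `T` describe player II's optimal stopping region.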

Optimal stationary policies in risk-sensitive dynamic programs with finite state space and nonnegative rewards

Rolando Cavazos-Cadena, Raúl Montes-de-Oca — 2000

Applicationes Mathematicae

This work concerns controlled Markov chains with finite state space and nonnegative rewards; it is assumed that the controller has a constant risk-sensitivity, and that the performance of a control policy is measured by a risk-sensitive expected total-reward criterion. The existence of optimal stationary policies is studied within this context, and the main result establishes the optimality of a stationary policy achieving the supremum in the corresponding optimality equation, whenever the associated...

Estimation and control in finite Markov decision processes with the average reward criterion

Rolando Cavazos-Cadena, Raúl Montes-de-Oca — 2004

Applicationes Mathematicae

This work concerns Markov decision chains with finite state and action sets. The transition law satisfies the simultaneous Doeblin condition but is unknown to the controller, and the problem of determining an optimal adaptive policy with respect to the average reward criterion is addressed. A subset of policies is identified so that, when the system evolves under a policy in that class, the frequency estimators of the transition law are consistent on an essential set of admissible state-action pairs,...
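The frequency estimators mentioned above are the empirical transition probabilities built from observed state-action-state triples. The sketch below illustrates the idea on a hypothetical two-state, two-action chain with a known ground-truth law used only to generate data; all names and numbers are illustrative assumptions:

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical ground-truth transition law p(y | x, a), unknown to the controller.
TRUE_P = {(0, 0): [0.8, 0.2], (0, 1): [0.3, 0.7],
          (1, 0): [0.5, 0.5], (1, 1): [0.1, 0.9]}

counts = defaultdict(lambda: [0, 0])  # (x, a) -> counts of observed next states

# Simulate the chain under a policy that keeps exploring both actions,
# so every admissible state-action pair is visited infinitely often.
x = 0
for _ in range(50_000):
    a = random.randrange(2)
    y = random.choices([0, 1], TRUE_P[(x, a)])[0]
    counts[(x, a)][y] += 1
    x = y

def p_hat(x, a, y):
    """Frequency (empirical) estimator of p(y | x, a)."""
    n = sum(counts[(x, a)])
    return counts[(x, a)][y] / n if n else 0.0

print(round(p_hat(0, 0, 0), 2))
```

Consistency of `p_hat` on the pairs that are visited infinitely often is exactly the property the abstract requires of the policy class it identifies.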

Stationary optimal policies in a class of multichain positive dynamic programs with finite state space and risk-sensitive criterion

Rolando Cavazos-Cadena, Raúl Montes-de-Oca — 2001

Applicationes Mathematicae

This work concerns Markov decision processes with finite state space and compact action sets. The decision maker is supposed to have a constant risk-sensitivity coefficient, and a control policy is graded via the risk-sensitive expected total-reward criterion associated with nonnegative one-step rewards. Assuming that the optimal value function is finite, under mild continuity and compactness restrictions the following result is established: If the number of ergodic classes when a stationary policy...

Risk-sensitive Markov stopping games with an absorbing state

This work is concerned with discrete-time Markov stopping games with two players. At each decision time player II can stop the game, paying a terminal reward to player I, or can let the system continue its evolution. In the latter case player I applies an action affecting the transitions and entitling him to receive a running reward from player II. It is supposed that player I has a nonnull, constant risk-sensitivity coefficient, and that player II tries to minimize the utility of player I....

Markov stopping games with an absorbing state and total reward criterion

This work is concerned with discrete-time zero-sum games with Markov transitions on a denumerable space. At each decision time player II can stop the system, paying a terminal reward to player I, or can let the system continue its evolution. If the system is not halted, player I selects an action which affects the transitions and receives a running reward from player II. Assuming the existence of an absorbing state which is accessible from any other state, the performance of a pair of decision...
