EuDML | Browse

Items

All a b c d e f g h i j k l m n o p q r s t u v w x y z Other

Previous Page 7

Displaying 121 – 133 of 133

The expected discounted reward from a Markov replacement process

Pavla Kunderová (1985)

Acta Universitatis Palackianae Olomucensis. Facultas Rerum Naturalium. Mathematica

The exponential cost optimality for finite horizon semi-Markov decision processes

Haifeng Huo, Xian Wen (2022)

Kybernetika

This paper considers an exponential cost optimality problem for finite horizon semi-Markov decision processes (SMDPs). The objective is to calculate an optimal policy with minimal exponential costs over the full set of policies in a finite horizon. First, under the standard regular and compact-continuity conditions, we establish the optimality equation, prove that the value function is the unique solution of the optimality equation and the existence of an optimal policy by using the minimum nonnegative...

The risk-sensitive Poisson equation for a communicating Markov chain on a denumerable state space

Rolando Cavazos-Cadena (2009)

Kybernetika

This work concerns a discrete-time Markov chain with time-invariant transition mechanism and denumerable state space, which is endowed with a nonnegative cost function with finite support. The performance of the chain is measured by the (long-run) risk-sensitive average cost and, assuming that the state space is communicating, the existence of a solution to the risk-sensitive Poisson equation is established, a result that holds even for transient chains. Also, a sufficient criterion ensuring that...

Théorie des processus de production

Claude Dellacherie (1990)

Séminaire de probabilités de Strasbourg

Three different operations research models for the same $(s, S)$ policy.

Ben-Ayed, Omar (2001)

Journal of Applied Mathematics and Decision Sciences

Time-discretization for controlled Markov processes. I. General approximation results

Nico M. van Dijk, Arie Hordijk (1996)

Kybernetika

Time-discretization for controlled Markov processes. II. A jump and diffusion application

Nico M. van Dijk, Arie Hordijk (1996)

Kybernetika

Time-varying Markov decision processes with state-action-dependent discount factors and unbounded costs

Beatris A. Escobedo-Trujillo, Carmen G. Higuera-Chan (2019)

Kybernetika

In this paper we are concerned with a class of time-varying discounted Markov decision models $ℳ_{n}$ with unbounded costs $c_{n}$ and state-action dependent discount factors. Specifically we study controlled systems whose state process evolves according to the equation $x_{n + 1} = G_{n} (x_{n}, a_{n}, ξ_{n}), n = 0, 1, ...$ , with state-action dependent discount factors of the form $α_{n} (x_{n}, a_{n})$ , where $a_{n}$ and $ξ_{n}$ are the control and the random disturbance at time $n$ , respectively. Assuming that the sequences of functions ${α_{n}}$ , ${c_{n}}$ and ${G_{n}}$ converge, in certain sense, to $α_{\infty}$ , $c_{\infty}$ and $G_{\infty}$ , our...

Transient phenomena and self-optimizing control of Markov chains

Petr Mandl, Gerhard Hübner (1985)

Acta Universitatis Carolinae. Mathematica et Physica

Uniform value in dynamic programming

Jérôme Renault (2011)

Journal of the European Mathematical Society

We consider dynamic programming problems with a large time horizon, and give sufficient conditions for the existence of the uniform value. As a consequence, we obtain an existence result when the state space is precompact, payoffs are uniformly continuous and the transition correspondence is non expansive. In the same spirit, we give an existence result for the limit value. We also apply our results to Markov decision processes and obtain a few generalizations of existing results.

Uniqueness of optimal policies as a generic property of discounted Markov decision processes: Ekeland's variational principle approach

R. Israel Ortega-Gutiérrez, Raúl Montes-de-Oca, Enrique Lemus-Rodríguez (2016)

Kybernetika

Many examples in optimization, ranging from Linear Programming to Markov Decision Processes (MDPs), present more than one optimal solution. The study of this non-uniqueness is of great mathematical interest. In this paper the authors show that in a specific family of discounted MDPs, non-uniqueness is a “fragile” property through Ekeland's Principle for each problem with at least two optimal policies; a perturbed model is produced with a unique optimal policy. This result not only supersedes previous...

Weak conditions for the existence of optimal stationary policies in average Markov decision chains with unbounded costs

Rolando Cavazos-Cadena (1989)

Kybernetika

Zum Problem des zweiarmigen Bernoulli-Banditen mit einer bekannten Erfolgswahrscheinlichkeit und unendliche vielen Spielen.

D. Kalin (1982)

Metrika

Currently displaying 121 – 133 of 133

Previous Page 7