EuDML | Browse

Items

Displaying 1 – 20 of 133

A consumption-investment problem modelled as a discounted Markov decision process

Hugo Cruz-Suárez, Raúl Montes-de-Oca, Gabriel Zacarías (2011)

Kybernetika

In this paper a problem of consumption and investment is presented as a discrete-time discounted Markov decision process. In this problem, it is assumed that the wealth is affected by a production function. This assumption gives the investor a chance to increase his wealth before the investment. To solve the problem, a suitable version of the Euler Equation (EE) is established which characterizes its optimal policy completely, that is, there are provided conditions...

A diffusion approximation in the ruin problem for a controlled Markov chain

Pham Kieu (1974)

Kybernetika

A game model referring to the control of independent discrete time stochastic processes

A. Styszyński (1983)

Applicationes Mathematicae

A generalized Markov decision process

Gary J. Koehler (1980)

RAIRO - Operations Research - Recherche Opérationnelle

A Markov chain model for traffic equilibrium problems

Giandomenico Mastroeni (2002)

RAIRO - Operations Research - Recherche Opérationnelle

We consider a stochastic approach in order to define an equilibrium model for a traffic-network problem. In particular, we assume a Markovian behaviour of the users in their movements throughout the zones of the traffic area. This assumption turns out to be effective at least in the context of urban traffic, where, in general, the users tend to travel by choosing the path they find more convenient and not necessarily depending on the already travelled part. The developed model is a homogeneous Markov...

A Markov chain model for traffic equilibrium problems

Giandomenico Mastroeni (2010)

RAIRO - Operations Research

We consider a stochastic approach in order to define an equilibrium model for a traffic-network problem. In particular, we assume a Markovian behaviour of the users in their movements throughout the zones of the traffic area. This assumption turns out to be effective at least in the context of urban traffic, where, in general, the users tend to travel by choosing the path they find more convenient and not necessarily depending on the already travelled part. The developed model is a homogeneous...

A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs

Óscar Vega-Amaya, Joaquín López-Borbón (2019)

Kybernetika

The present paper studies the approximate value iteration (AVI) algorithm for the average cost criterion with bounded costs and Borel spaces. The convergence of the algorithm is shown and a performance bound is provided, assuming that the model satisfies a standard continuity-compactness assumption and a uniform ergodicity condition. This is done for the class of approximation procedures that can be represented by linear positive operators which give exact representation of constant functions and...

A semimartingale characterization of average optimal stationary policies for Markov decision processes

Quanxin Zhu, Xianping Guo (2006)

Journal of Applied Mathematics and Stochastic Analysis

A separation theorem for expected value and feared value discrete time control

Pierre Bernhard (1996)

ESAIM: Control, Optimisation and Calculus of Variations

A Separation Theorem for Expected Value and Feared Value Discrete Time Control

Pierre Bernhard (2010)

ESAIM: Control, Optimisation and Calculus of Variations

We show how the use of a parallel between the ordinary (+, ×) and the (max, +) algebras, Maslov measures that exploit this parallel, and more specifically their specialization to probabilities and the corresponding cost measures of Quadrat, offer a completely parallel treatment of stochastic and minimax control of disturbed nonlinear discrete time systems with partial information. This paper is based upon, and improves, the discrete time part of the earlier paper [9].

A stopping rule for discounted Markov decision processes with finite action sets

Raúl Montes-de-Oca, Enrique Lemus-Rodríguez, Daniel Cruz-Suárez (2009)

Kybernetika

In a Discounted Markov Decision Process (DMDP) with finite action sets the Value Iteration Algorithm, under suitable conditions, leads to an optimal policy in a finite number of steps. Determining an upper bound on the number of steps needed until convergence is an issue of great theoretical and practical interest as it would provide a computationally feasible stopping rule for value iteration as an algorithm for finding an optimal policy. In this paper we find such a bound depending only...
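
As an illustration of what a stopping rule for value iteration looks like, here is a minimal sketch for a finite discounted cost model. It uses the classical sup-norm criterion (stop when successive iterates differ by less than eps(1 - beta)/(2 beta), which makes the greedy policy eps-optimal), not the bound derived in the paper, whose abstract is truncated here. The function name `value_iteration`, the array layout of `P` and `c`, and the tolerance `eps` are all assumptions made for this example.

```python
import numpy as np


def value_iteration(P, c, beta, eps=1e-8, max_iter=10_000):
    """Value iteration for a finite discounted cost MDP (illustrative sketch).

    P    : array of shape (A, S, S); P[a, s, t] is the probability of moving
           from state s to state t under action a (hypothetical layout).
    c    : array of shape (S, A); c[s, a] is the one-stage cost.
    beta : discount factor in (0, 1).
    Returns the last value iterate, a greedy policy for it, and the number
    of iterations performed.
    """

    def q_values(v):
        # Q[s, a] = c[s, a] + beta * sum_t P[a, s, t] * v[t]
        return c + beta * np.einsum("ast,t->sa", P, v)

    v = np.zeros(c.shape[0])
    threshold = eps * (1.0 - beta) / (2.0 * beta)
    for n in range(1, max_iter + 1):
        v_new = q_values(v).min(axis=1)
        # Classical stopping rule: sup-norm distance between iterates.
        if np.max(np.abs(v_new - v)) < threshold:
            return v_new, q_values(v_new).argmin(axis=1), n
        v = v_new
    # Iteration budget exhausted: return the current iterate anyway.
    return v, q_values(v).argmin(axis=1), max_iter


# Toy model: action 0 keeps the current state, action 1 switches state.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
c = np.array([[1.0, 0.5],
              [0.0, 2.0]])
v, policy, n_iter = value_iteration(P, c, beta=0.9)
```

In the toy model, state 1 is free to stay in, so the greedy policy should switch out of state 0 and then stay in state 1.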

A vanishing discount limit theorem for controlled Markov chains

Monika Laušmanová (1989)

Kybernetika

About stability of risk-seeking optimal stopping

Raúl Montes-de-Oca, Elena Zaitseva (2014)

Kybernetika

We offer a quantitative estimation of the stability of risk-sensitive cost optimization in the problem of optimal stopping of a Markov chain on a Borel space $X$. It is supposed that the transition probability $p(\cdot | x)$, $x \in X$, is approximated by the transition probability $\tilde{p}(\cdot | x)$, $x \in X$, and that the stopping rule $\tilde{f}_{*}$, which is optimal for the process with the transition probability $\tilde{p}$, is applied to the process with the transition probability $p$. We give an upper bound (expressed in terms of the total variation distance: $\sup_{x \in X} \| p(\cdot | x) - \tilde{p}(\cdot | x) \|$) for...

Adaptive control for discrete-time Markov processes with unbounded costs: Discounted criterion

Evgueni I. Gordienko, J. Adolfo Minjárez-Sosa (1998)

Kybernetika

We study the adaptive control problem for discrete-time Markov control processes with Borel state and action spaces and possibly unbounded one-stage costs. The processes are given by recurrent equations $x_{t+1} = F(x_{t}, a_{t}, \xi_{t})$, $t = 0, 1, \ldots$, with i.i.d. $\mathbb{R}^{k}$-valued random vectors $\xi_{t}$ whose density $\rho$ is unknown. Assuming observability of $\xi_{t}$, we propose a procedure of statistical estimation of $\rho$ that allows us to prove discounted asymptotic optimality of two types of adaptive policies used earlier for processes with bounded costs.

Admission controls for Erlang's loss system with service times distributed as a finite sum of exponential random variables

Brian Moretta, Ilze Ziedins (1998)

Journal of Applied Mathematics and Decision Sciences

An extended version of average Markov decision processes on discrete spaces under fuzzy environment

Hugo Cruz-Suárez, Raúl Montes-de-Oca, R. Israel Ortega-Gutiérrez (2023)

Kybernetika

The article presents an extension of the theory of standard Markov decision processes on discrete spaces with the average cost as the objective function, which makes it possible to take into account a fuzzy average cost of a trapezoidal type. In this context, the fuzzy optimal control problem is considered with respect to two cases: the max-order of the fuzzy numbers and the average ranking order of the trapezoidal fuzzy numbers. Each of these cases extends the standard optimal control problem, and for...

An optimality system for finite average Markov decision chains under risk-aversion

Alfredo Alanís-Durán, Rolando Cavazos-Cadena (2012)

Kybernetika

This work concerns controlled Markov chains with finite state space and compact action sets. The decision maker is risk-averse with constant risk-sensitivity, and the performance of a control policy is measured by the long-run average cost criterion. Under standard continuity-compactness conditions, it is shown that the (possibly non-constant) optimal value function is characterized by a system of optimality equations from which an optimal stationary policy can be obtained. Also, it is shown that the...

An SMDP model for a multiclass multi-server queueing control problem considering conversion times

Zhicong Zhang, Na Li, Shuai Li, Xiaohui Yan, Jianwen Guo (2014)

RAIRO - Operations Research - Recherche Opérationnelle

We address a queueing control problem considering service times and conversion times following normal distributions. We formulate the multi-server queueing control problem by constructing a semi-Markov decision process (SMDP) model. The mechanism of state transitions is developed through mathematical derivation of the transition probabilities and transition times. We also study the property of the queueing control system and show that optimizing the objective function of the addressed queueing control...

An unbounded Berge's minimum theorem with applications to discounted Markov decision processes

Raúl Montes-de-Oca, Enrique Lemus-Rodríguez (2012)

Kybernetika

This paper deals with a certain class of unbounded optimization problems. The optimization problems taken into account depend on a parameter. First, conditions are established which guarantee the continuity with respect to the parameter of the minimum of the optimization problems under consideration, and the upper semicontinuity of the multifunction which maps each parameter to its set of minimizers. Besides, with the additional condition of uniqueness of the minimizer, its...

Another set of verifiable conditions for average Markov decision processes with Borel spaces

Xiaolong Zou, Xianping Guo (2015)

Kybernetika

In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions, which consists only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the model of Markov decision processes and are thus easy to verify. We also give two...
