Approximation and estimation in Markov control processes under a discounted criterion

J. Adolfo Minjárez-Sosa (2004)

Kybernetika

We consider a class of discrete-time Markov control processes with Borel state and action spaces, and ℝ^k-valued i.i.d. disturbances with unknown density ρ. Supposing possibly unbounded costs, we combine suitable density estimation methods for ρ with approximation procedures for the optimal cost function to show the existence of a sequence {f̂_t} of minimizers converging to an optimal stationary policy f_∞.
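
A minimal sketch of this estimate-then-optimize idea, assuming (hypothetically) a scalar linear system x_{t+1} = x_t + a_t + ξ_t, quadratic cost, and a Gaussian kernel density estimate standing in for the paper's density estimation step; all model choices below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 0.9                                # discount factor
states = np.linspace(-2.0, 2.0, 41)       # discretized state space
actions = np.linspace(-1.0, 1.0, 11)
xi_obs = rng.normal(0.0, 0.5, size=200)   # i.i.d. disturbances, density rho unknown

def rho_hat(z, data, h=0.2):
    """Gaussian kernel density estimate of the unknown density rho."""
    return np.exp(-0.5 * ((z - data[:, None]) / h) ** 2).mean(axis=0) / (h * np.sqrt(2 * np.pi))

grid = np.linspace(-2.0, 2.0, 81)         # quadrature grid for the disturbance
w = rho_hat(grid, xi_obs)
w /= w.sum()                              # normalized quadrature weights

V = np.zeros_like(states)
Q = np.empty((len(states), len(actions)))
for _ in range(100):                      # value iteration with the estimated density
    for i, x in enumerate(states):
        for j, a in enumerate(actions):
            x_next = np.clip(x + a + grid, states[0], states[-1])
            Q[i, j] = x**2 + a**2 + beta * (w @ np.interp(x_next, states, V))
    V = Q.min(axis=1)

f_hat = actions[Q.argmin(axis=1)]         # stage minimizers, an analogue of the f-hat_t
```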

Approximation, estimation and control of stochastic systems under a randomized discounted cost criterion

Juan González-Hernández, Raquiel R. López-Martínez, J. Adolfo Minjárez-Sosa (2009)

Kybernetika

The paper deals with a class of discrete-time stochastic control processes under a discounted optimality criterion with random discount rate, and possibly unbounded costs. The state process x_t and the discount process α_t evolve according to the coupled difference equations x_{t+1} = F(x_t, α_t, a_t, ξ_t), α ...
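
Since the recursion for α_t is cut off above, the toy simulation below simply draws α_t i.i.d.; the transition map F, the cost, and the stationary policy a_t = -x_t/2 are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
x, total, disc = 1.0, 0.0, 1.0
for t in range(200):
    alpha = rng.uniform(0.85, 0.95)         # random discount factor alpha_t
    a = -0.5 * x                            # a fixed stationary policy
    total += disc * (x**2 + a**2)           # accumulate the randomized discounted cost
    disc *= alpha                           # running product of random discounts
    x = 0.9 * x + a + rng.normal(0.0, 0.1)  # x_{t+1} = F(x_t, a_t, xi_t), alpha-independent here
print("realized discounted cost:", total)
```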

Asymptotic properties and optimization of some non-Markovian stochastic processes

Evgueni I. Gordienko, Antonio Garcia, Juan Ruiz de Chavez (2009)

Kybernetika

We study the limit behavior of certain classes of dependent random sequences (processes) which do not possess the Markov property. Assuming these processes depend on a control parameter, we show that the optimization of the control can be reduced to a problem of nonlinear optimization. Under certain hypotheses, we establish the stability of such optimization problems.

Average cost Markov control processes with weighted norms: existence of canonical policies

Evgueni Gordienko, Onésimo Hernández-Lerma (1995)

Applicationes Mathematicae

This paper considers discrete-time Markov control processes on Borel spaces, with possibly unbounded costs, and the long run average cost (AC) criterion. Under appropriate hypotheses on weighted norms for the cost function and the transition law, the existence of solutions to the average cost optimality inequality and the average cost optimality equation is shown, which in turn yields the existence of AC-optimal and AC-canonical policies, respectively.
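
For reference, in the standard notation of this framework (ρ* the optimal average cost, h a relative value function, c the one-stage cost, Q the transition law) the two objects take the textbook forms below; this is the generic statement, not a quotation from the paper:

```latex
\[
  \rho^* + h(x) \;=\; \min_{a \in A(x)} \Big[\, c(x,a) + \int_X h(y)\, Q(dy \mid x, a) \Big]
  \qquad \text{(ACOE)}
\]
\[
  \rho^* + h(x) \;\ge\; \min_{a \in A(x)} \Big[\, c(x,a) + \int_X h(y)\, Q(dy \mid x, a) \Big]
  \qquad \text{(ACOI)}
\]
```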

Average cost Markov control processes with weighted norms: value iteration

Evgueni Gordienko, Onésimo Hernández-Lerma (1995)

Applicationes Mathematicae

This paper shows the convergence of the value iteration (or successive approximations) algorithm for average cost (AC) Markov control processes on Borel spaces, with possibly unbounded cost, under appropriate hypotheses on weighted norms for the cost function and the transition law. It is also shown that the aforementioned convergence implies strong forms of AC-optimality and the existence of forecast horizons.
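
The finite-state analogue of the algorithm is relative value iteration; the two-state, two-action model below is invented, but it shows the shape of the iteration:

```python
import numpy as np

# P[a] is the transition matrix under action a, c[:, a] the cost of action a.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.6, 0.4]])]
c = np.array([[1.0, 2.0],
              [0.5, 3.0]])

h = np.zeros(2)
for _ in range(500):
    Th = np.min(c + np.stack([P[a] @ h for a in range(2)], axis=1), axis=1)
    rho = Th[0] - h[0]     # normalization at a reference state
    h = Th - Th[0]         # relative value function
print("optimal average cost ~", rho)
```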

Bayesian estimation of the mean holding time in average semi-Markov control processes

J. Adolfo Minjárez-Sosa, José A. Montoya (2015)

Applicationes Mathematicae

We consider semi-Markov control models with Borel state and action spaces, possibly unbounded costs, and holding times with a generalized exponential distribution with unknown mean θ. Assuming that such a distribution does not depend on the state-action pairs, we introduce a Bayesian estimation procedure for θ, which combined with a variant of the vanishing discount factor approach yields average cost optimal policies.
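
As a toy instance of the Bayesian step, assume (purely for illustration) plain exponential holding times, for which a Gamma prior on the rate 1/θ is conjugate; the paper's generalized exponential family would change the update, not the pattern:

```python
import numpy as np

rng = np.random.default_rng(2)
theta_true = 2.0
a, b = 1.0, 1.0                            # Gamma(a, b) prior on the rate 1/theta
for _ in range(1000):
    delta = rng.exponential(theta_true)    # observed holding time
    a, b = a + 1.0, b + delta              # conjugate posterior update
theta_hat = b / (a - 1.0)                  # posterior mean of theta = E[1/rate]
print(theta_hat)                           # close to theta_true
```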

Bayesian parameter estimation and adaptive control of Markov processes with time-averaged cost

V. S. Borkar (1998)

Applicationes Mathematicae

This paper considers Bayesian parameter estimation and an associated adaptive control scheme for controlled Markov chains and diffusions with time-averaged cost. Asymptotic behaviour of the posterior law of the parameter given the observed trajectory is analyzed. This analysis suggests a "cost-biased" estimation scheme and associated self-tuning adaptive control. This is shown to be asymptotically optimal in the almost sure sense.
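
The cost-biased idea can be illustrated on the simplest possible instance, a single-state process with two actions of unknown mean cost (not the paper's controlled-chain setting); biasing each cost estimate downward avoids the classical failure mode of certainty-equivalent self-tuning, which can lock onto a suboptimal action:

```python
import numpy as np

rng = np.random.default_rng(3)
true_mean = np.array([1.0, 0.7])          # action 1 is actually cheaper
n = np.ones(2)                            # observation counts (pseudo-count 1)
s = np.ones(2)                            # running cost sums
total, T = 0.0, 5000
for t in range(1, T + 1):
    biased = s / n - np.sqrt(2 * np.log(t + 1) / n)  # cost-biased estimates
    a = int(np.argmin(biased))
    cost = true_mean[a] + rng.normal(0.0, 0.2)
    n[a] += 1.0
    s[a] += cost
    total += cost
print("time-averaged cost:", total / T)   # approaches 0.7, the optimal value
```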

Bottom-up learning of hierarchical models in a class of deterministic POMDP environments

Hideaki Itoh, Hisao Fukumoto, Hiroshi Wakuya, Tatsuya Furukawa (2015)

International Journal of Applied Mathematics and Computer Science

The theory of partially observable Markov decision processes (POMDPs) is a useful tool for developing various intelligent agents, and learning hierarchical POMDP models is one of the key approaches for building such agents when the environments of the agents are unknown and large. To learn hierarchical models, bottom-up learning methods in which learning takes place in a layer-by-layer manner from the lowest to the highest layer are already extensively used in some research fields such as hidden...

Calculating the variance in Markov-processes with random reward.

Francisco Benito (1982)

Trabajos de Estadística e Investigación Operativa

In this article we present a generalization of Markov Decision Processes with discrete time where the immediate rewards in every period are not deterministic but random, with the first two moments of the distribution given. Formulas are developed to calculate the expected value and the variance of the reward of the process, formulas which generalize and partially correct other results. We make some observations about the distribution of rewards for processes with limited or unlimited horizon and...
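
The moment recursion the abstract refers to can be written down directly for a finite chain. The numbers below are invented; the recursion is the standard one for the total reward over a finite horizon when immediate rewards are independent given the state, with per-state mean m and variance v:

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
m = np.array([1.0, -0.5])   # E[immediate reward | state]
v = np.array([0.25, 1.0])   # Var[immediate reward | state]

E = np.zeros(2)             # E[total reward to go]
V = np.zeros(2)             # Var[total reward to go]
for _ in range(20):         # horizon N = 20
    Pe, Pe2 = P @ E, P @ (E ** 2)
    V = v + P @ V + Pe2 - Pe ** 2   # law of total variance
    E = m + Pe
print("mean:", E, "variance:", V)
```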

Colored decision process Petri nets: modeling, analysis and stability

Julio Clempner (2005)

International Journal of Applied Mathematics and Computer Science

In this paper we introduce a new modeling paradigm for developing a decision process representation called the Colored Decision Process Petri Net (CDPPN). It extends the Colored Petri Net (CPN) theoretic approach by including Markov decision processes. CPNs are used for process representation, taking advantage of the formal semantics and the graphical display. A Markov decision process is utilized as a tool for trajectory planning via a utility function. The main point of the CDPPN is its ability to...

Constrained optimality problem of Markov decision processes with Borel spaces and varying discount factors

Xiao Wu, Yanqiu Tang (2021)

Kybernetika

This paper focuses on the constrained optimality of discrete-time Markov decision processes (DTMDPs) with state-dependent discount factors, Borel state and compact Borel action spaces, and possibly unbounded costs. By means of the properties of so-called occupation measures of policies and the technique of transforming the original constrained optimality problem for DTMDPs into an equivalent convex program, we prove the existence of an optimal randomized stationary policy under reasonable conditions.
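
A minimal sketch of the occupation-measure reformulation on a two-state, two-action discounted MDP (fixed discount factor and finite spaces, unlike the paper's setting; all numbers invented), using scipy's LP solver:

```python
import numpy as np
from scipy.optimize import linprog

beta = 0.9
P = np.array([  # P[a][x][y]
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.1, 0.9], [0.9, 0.1]],
])
c = np.array([[1.0, 0.2], [2.0, 0.5]])   # c[x][a], objective cost
d = np.array([[0.0, 1.0], [0.0, 1.0]])   # d[x][a], constrained cost
nu = np.array([0.5, 0.5])                # initial distribution
kappa = 3.0                              # constraint level

# Variables mu[x, a] flattened as mu[2*x + a]; flow constraint:
# sum_a mu(y,a) - beta * sum_{x,a} P(y|x,a) mu(x,a) = nu(y).
A_eq = np.zeros((2, 4))
for y in range(2):
    for x in range(2):
        for a in range(2):
            A_eq[y, 2 * x + a] = (1.0 if x == y else 0.0) - beta * P[a][x][y]

res = linprog(c.ravel(), A_ub=[d.ravel()], b_ub=[kappa],
              A_eq=A_eq, b_eq=nu, bounds=[(0, None)] * 4)
mu = res.x.reshape(2, 2)
policy = mu / mu.sum(axis=1, keepdims=True)   # optimal randomized stationary policy
print(policy)
```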

Contrôle dynamique de flux dans un système d'attente avec panne (Dynamic flow control in a queueing system subject to breakdowns)

A. Haqiq, N. Mikou (2010)

RAIRO - Operations Research

We consider two parallel M/M/1 queues. The server at one of the queues is subject to intermittent breakdowns. Using the theory of dynamic programming, we determine an optimal threshold policy which consists in transferring, when necessary, customers arriving at the first queue to the second queue in order to minimize an instantaneous cost depending on the two queue lengths.
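
A uniformized value-iteration sketch of the routing decision (omitting the paper's breakdown dynamics for brevity; rates, costs, and the truncation level are invented): on each arrival at the first queue, the controller either keeps the customer or transfers it to the second queue:

```python
import numpy as np

lam, mu1, mu2 = 1.0, 0.8, 0.9
c1, c2, beta = 2.0, 1.0, 0.95
N = 30                                      # truncation of queue lengths
Lam = lam + mu1 + mu2                       # uniformization constant

V = np.zeros((N + 1, N + 1))
for _ in range(500):                        # discounted value iteration
    Vn = np.empty_like(V)
    for n1 in range(N + 1):
        for n2 in range(N + 1):
            keep = V[min(n1 + 1, N), n2]    # arrival kept at queue 1
            move = V[n1, min(n2 + 1, N)]    # arrival transferred to queue 2
            s1 = V[max(n1 - 1, 0), n2]      # service completion at queue 1
            s2 = V[n1, max(n2 - 1, 0)]      # service completion at queue 2
            Vn[n1, n2] = (c1 * n1 + c2 * n2
                          + beta * (lam * min(keep, move) + mu1 * s1 + mu2 * s2) / Lam)
    V = Vn

# The optimal decision is of threshold type in (n1, n2): transfer iff cheaper.
transfer = np.array([[V[n1, min(n2 + 1, N)] < V[min(n1 + 1, N), n2]
                      for n2 in range(N + 1)] for n1 in range(N + 1)])
```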

Deterministic optimal policies for Markov control processes with pathwise constraints

Armando F. Mendoza-Pérez, Onésimo Hernández-Lerma (2012)

Applicationes Mathematicae

This paper deals with discrete-time Markov control processes in Borel spaces with unbounded rewards. Under suitable hypotheses, we show that a randomized stationary policy is optimal for a certain expected constrained problem (ECP) if and only if it is optimal for the corresponding pathwise constrained problem (pathwise CP). Moreover, we show that a certain parametric family of unconstrained optimality equations yields convergence properties that lead to an approximation scheme which allows us to...

Discounted Markov control processes induced by deterministic systems

Hugo Cruz-Suárez, Raúl Montes-de-Oca (2006)

Kybernetika

This paper deals with Markov Control Processes (MCPs) on Euclidean spaces with an infinite horizon and a discounted total cost. Firstly, MCPs which result from deterministic controlled systems will be analyzed. For such MCPs, conditions will be given that permit establishing the equation known in the economics literature as the Euler Equation (EE). An example will also be presented of a Markov control process with a deterministic controlled system where, to obtain the optimal value function,...
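
For orientation, in the standard economics formulation where the next state y is chosen directly from a feasible set, r(x, y) is the one-stage return and β the discount factor, the Euler Equation couples three consecutive states; this is the textbook form, not necessarily the exact variant used in the paper:

```latex
% First-order and envelope conditions for
% v(x) = \max_{y} \,[\, r(x, y) + \beta\, v(y) \,]  combine into the Euler Equation:
\[
  \frac{\partial r}{\partial y}\big(x_t, x_{t+1}\big)
  + \beta\, \frac{\partial r}{\partial x}\big(x_{t+1}, x_{t+2}\big) = 0 ,
  \qquad t = 0, 1, 2, \dots
\]
```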
