
Approximation and estimation in Markov control processes under a discounted criterion

J. Adolfo Minjárez-Sosa — 2004

Kybernetika

We consider a class of discrete-time Markov control processes with Borel state and action spaces, and ℝᵏ-valued i.i.d. disturbances with unknown density ρ. Supposing possibly unbounded costs, we combine suitable density estimation methods for ρ with approximation procedures for the optimal cost function to show the existence of a sequence {f̂_t} of minimizers converging to an optimal stationary policy f.

Bayesian estimation of the mean holding time in average semi-Markov control processes

J. Adolfo Minjárez-Sosa, José A. Montoya — 2015

Applicationes Mathematicae

We consider semi-Markov control models with Borel state and action spaces, possibly unbounded costs, and holding times with a generalized exponential distribution with unknown mean θ. Assuming that such a distribution does not depend on the state-action pairs, we introduce a Bayesian estimation procedure for θ, which combined with a variant of the vanishing discount factor approach yields average cost optimal policies.
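The Bayesian step can be illustrated with a minimal sketch. The paper treats a generalized exponential family; as a stand-in assumption, the holding times below are plain exponential with mean θ, and a conjugate Gamma prior is placed on the rate 1/θ so the posterior mean of θ has a closed form. The specific prior, sample size, and true mean are illustrative, not taken from the paper.

```python
import random

# Stand-in model (an assumption, not the paper's generalized exponential
# family): holding times ~ Exponential with unknown mean theta, and a
# conjugate Gamma(a0, b0) prior on the rate 1/theta. The posterior of the
# rate is Gamma(a0 + n, b0 + sum(data)), so the posterior mean of theta is
#     E[theta | data] = (b0 + sum(data)) / (a0 + n - 1),   a0 + n > 1.

random.seed(3)
theta_true = 2.0                     # illustrative true mean holding time
holding_times = [random.expovariate(1.0 / theta_true) for _ in range(400)]

a0, b0 = 2.0, 2.0                    # Gamma prior on the rate (illustrative)
n = len(holding_times)
posterior_mean_theta = (b0 + sum(holding_times)) / (a0 + n - 1)
print(posterior_mean_theta)          # concentrates near theta_true as n grows
```

As the sample of observed holding times grows, the posterior mean converges to the true θ, which is what lets the estimated model be plugged into the vanishing-discount construction of average-optimal policies.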

Adaptive control for discrete-time Markov processes with unbounded costs: Discounted criterion

We study the adaptive control problem for discrete-time Markov control processes with Borel state and action spaces and possibly unbounded one-stage costs. The processes are given by the recurrence equations x_{t+1} = F(x_t, a_t, ξ_t), t = 0, 1, ..., with i.i.d. ℝᵏ-valued random vectors ξ_t whose density ρ is unknown. Assuming observability of ξ_t, we propose a statistical estimation procedure for ρ that allows us to prove the discounted asymptotic optimality of two types of adaptive policies used earlier for processes with bounded costs.
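The setup above can be sketched in a few lines. The transition map F, the zero action, and the use of a Gaussian kernel density estimate are illustrative assumptions; the point is only that, because the disturbances ξ_t are observable, they can be recorded along the trajectory and their unknown density ρ estimated nonparametrically.

```python
import math
import random

def step(x, a, xi):
    """A toy scalar transition F(x, a, xi): stable linear dynamics plus noise
    (an illustrative choice, not the paper's model)."""
    return 0.9 * x + a + xi

def kde(samples, bandwidth=0.3):
    """Gaussian kernel density estimate built from the observed xi_t; this
    stands in for the paper's statistical estimation procedure for rho."""
    n = len(samples)
    c = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    def rho_hat(z):
        return c * sum(math.exp(-0.5 * ((z - s) / bandwidth) ** 2)
                       for s in samples)
    return rho_hat

random.seed(0)
x, observed = 0.0, []
for t in range(500):
    xi = random.gauss(0.0, 1.0)   # true density rho: standard normal
    observed.append(xi)           # xi_t is observable, so it is recorded
    x = step(x, a=0.0, xi=xi)     # zero action, just to drive the simulation

rho_hat = kde(observed)
print(rho_hat(0.0))               # compare with rho(0) = 1/sqrt(2*pi) ≈ 0.399
```

An adaptive policy then acts at each stage as if the current estimate ρ̂_t were the true density, and the estimation error is what the asymptotic-optimality argument must control.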

Empirical approximation in Markov games under unbounded payoff: discounted and average criteria

This work deals with a class of discrete-time zero-sum Markov games whose state process x_t evolves according to the equation x_{t+1} = F(x_t, a_t, b_t, ξ_t), where a_t and b_t represent the actions of players 1 and 2, respectively, and {ξ_t} is a sequence of independent and identically distributed random variables with unknown distribution θ. Assuming possibly unbounded payoff, and using the empirical distribution to estimate θ, we introduce approximation schemes for the value of the game as well as for optimal strategies, considering both...
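The empirical-distribution step admits a compact sketch: an expectation under the unknown θ is replaced by an average over recorded draws of ξ_t. The map F, the payoff, and the fixed actions below are illustrative assumptions, not the paper's game.

```python
import random

def F(x, a, b, xi):
    """Toy game dynamics x_{t+1} = F(x_t, a_t, b_t, xi_t) (illustrative)."""
    return 0.7 * x + a - b + xi

def payoff(x):
    return min(x * x, 10.0)   # truncation just keeps this toy payoff bounded

random.seed(2)
samples = [random.gauss(0.0, 1.0) for _ in range(2000)]  # observed xi_t draws

def expected_payoff(x, a, b):
    """E_theta[payoff(F(x, a, b, xi))], approximated by replacing the unknown
    theta with the empirical distribution of the recorded samples."""
    return sum(payoff(F(x, a, b, xi)) for xi in samples) / len(samples)

est = expected_payoff(x=0.0, a=0.2, b=0.1)
print(est)
```

Plugging this empirical expectation into the Shapley/value-iteration operator is what yields computable approximations of the value of the game and of near-optimal strategies.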

Approximation, estimation and control of stochastic systems under a randomized discounted cost criterion

The paper deals with a class of discrete-time stochastic control processes under a discounted optimality criterion with random discount rate and possibly unbounded costs. The state process x_t and the discount process α_t evolve according to the coupled difference equations x_{t+1} = F(x_t, α_t, a_t, ξ_t), α_{t+1} = G(α_t, η_t), where the state and discount disturbance processes {ξ_t} and {η_t} are sequences of i.i.d. random variables with densities ρ_ξ and ρ_η, respectively. The main objective is to introduce approximation algorithms of the optimal...
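A minimal simulation of the coupled recursions may help fix ideas. F, G, the cost c, and the feedback policy below are all illustrative assumptions; the only structural point taken from the abstract is that the discount rate α_t is itself a random process, so the accumulated discount factor is a running product of the realized α_t rather than a fixed power βᵗ.

```python
import random

def F(x, alpha, a, xi):
    """Toy state dynamics x_{t+1} = F(x_t, alpha_t, a_t, xi_t)."""
    return 0.8 * x + alpha * a + xi

def G(alpha, eta):
    """Toy discount dynamics alpha_{t+1} = G(alpha_t, eta_t), clipped so the
    rate stays in (0, 1) and the discounted series remains summable."""
    new = 0.5 * alpha + 0.1 * eta
    return min(max(new, 0.05), 0.95)

def cost(x, a):
    return x * x + a * a    # illustrative one-stage cost

random.seed(1)
x, alpha, discount, total = 1.0, 0.9, 1.0, 0.0
for t in range(200):
    a = -0.5 * x                       # a simple stationary feedback policy
    total += discount * cost(x, a)     # randomized discounted cost so far
    discount *= alpha                  # random rate enters the running product
    xi, eta = random.gauss(0.0, 0.1), random.uniform(0.0, 1.0)
    x, alpha = F(x, alpha, a, xi), G(alpha, eta)

print(total)
```

Approximation algorithms for this criterion then have to track both the state and the discount process, which is why the two recursions appear coupled in the model.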
