We introduce average cost optimal adaptive policies in a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to the system equations x_{t+1} = F(x_t, a_t, ξ_t), t = 1, 2, ..., with i.i.d. ℝ^k-valued random vectors ξ_t, which are observable but whose density ϱ is unknown.
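A minimal sketch of this model class, assuming a hypothetical stable linear map F and a placeholder stationary policy (neither comes from the paper); it only illustrates how the observable disturbances ξ_t accumulate into a sample for later estimation of ϱ:

```python
import numpy as np

rng = np.random.default_rng(0)

def F(x, a, xi):
    # Hypothetical system map: stable linear dynamics plus additive noise.
    return 0.9 * x + a + xi

def policy(x):
    # Placeholder stationary policy; the paper constructs adaptive ones.
    return -0.5 * x

x = 0.0
observed = []                  # xi_t is observable, so the controller records it
for t in range(100):
    xi = rng.normal()          # drawn from the unknown density (here N(0, 1))
    observed.append(xi)
    x = F(x, policy(x), xi)
# the sample `observed` feeds the density-estimation step sketched below
```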
We consider a class of discrete-time Markov control processes with Borel state and action spaces, and ℝ^k-valued i.i.d. disturbances ξ_t with unknown density ϱ. Supposing possibly unbounded costs, we combine suitable density estimation methods for ϱ with approximation procedures for the optimal cost function to show the existence of a sequence of minimizers converging to an optimal stationary policy.
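One concrete way to realize this estimation-plus-approximation scheme, sketched under strong simplifying assumptions: a Gaussian kernel density estimate stands in for the paper's density estimation method, and a discretized discounted Bellman update stands in for its approximation procedure (the dynamics F, cost, grids, and discount factor are all illustrative):

```python
import numpy as np

def kde(sample, h=0.3):
    # Gaussian kernel density estimate of the unknown disturbance density.
    def rho_hat(z):
        return np.mean(np.exp(-0.5 * ((z - sample) / h) ** 2)) / (h * np.sqrt(2 * np.pi))
    return rho_hat

def bellman_update(V, states, actions, xi_grid, rho_hat, cost, F, beta=0.95):
    # One discounted Bellman step with the expectation taken under rho_hat.
    w = np.array([rho_hat(z) for z in xi_grid])
    w = w / w.sum()                          # normalized quadrature weights
    V_new = np.empty_like(V)
    for i, x in enumerate(states):
        def q(a):
            nxt = np.clip(F(x, a, xi_grid), states[0], states[-1])
            return cost(x, a) + beta * np.dot(w, np.interp(nxt, states, V))
        V_new[i] = min(q(a) for a in actions)
    return V_new

rng = np.random.default_rng(1)
sample = rng.normal(size=200)                # observed disturbances xi_1, ..., xi_n
rho_hat = kde(sample)
states = np.linspace(-5.0, 5.0, 101)         # discretized state space
actions = np.linspace(-1.0, 1.0, 11)         # discretized action space
xi_grid = np.linspace(-4.0, 4.0, 41)         # quadrature grid for the disturbance
V = np.zeros_like(states)
for _ in range(50):                          # successive approximations of the cost function
    V = bellman_update(V, states, actions, xi_grid, rho_hat,
                       cost=lambda x, a: x ** 2 + a ** 2,
                       F=lambda x, a, xi: 0.9 * x + a + xi)
```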
We consider semi-Markov control models with Borel state and action spaces, possibly unbounded costs, and holding times with a generalized exponential distribution with unknown mean θ. Assuming that this distribution does not depend on the state-action pairs, we introduce a Bayesian estimation procedure for θ which, combined with a variant of the vanishing discount factor approach, yields average cost optimal policies.
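A hedged sketch of the Bayesian step for the plain exponential special case, using a conjugate Gamma prior on the rate 1/θ (the paper's generalized exponential family and its actual prior are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)

# Conjugate Bayesian update for i.i.d. exponential holding times; the paper's
# model is a *generalized* exponential family, simplified here to the plain case.
a0, b0 = 2.0, 1.0                                      # Gamma(a0, b0) prior on the rate 1/theta
holding_times = rng.exponential(scale=3.0, size=500)   # simulated data, true theta = 3

a_post = a0 + holding_times.size                       # posterior shape
b_post = b0 + holding_times.sum()                      # posterior rate
theta_hat = b_post / (a_post - 1)                      # posterior mean of theta = E[1/rate]
print(theta_hat)                                       # concentrates near 3 as data accumulate
```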
We study the adaptive control problem for discrete-time Markov control processes with Borel state and action spaces and possibly unbounded one-stage costs. The processes are given by the recurrent equations x_{t+1} = F(x_t, a_t, ξ_t), t = 1, 2, ..., with i.i.d. ℝ^k-valued random vectors ξ_t whose density ϱ is unknown. Assuming observability of ξ_t, we propose a statistical estimation procedure for ϱ that allows us to prove the discounted asymptotic optimality of two types of adaptive policies used earlier for processes with bounded costs.
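The abstract does not spell out the two policy types; the sketch below shows the generic "estimate and control" pattern behind such adaptive policies, reusing kde, bellman_update, and the grids from the sketch above (F, cost, and the discount factor remain toy assumptions):

```python
import numpy as np

# Generic "estimate and control" loop: at stage t the density estimate uses
# xi_1, ..., xi_{t-1}, the value function is updated under it, and the greedy
# action is applied; then xi_t is realized, observed, and added to the sample.
rng = np.random.default_rng(3)
cost = lambda x, a: x ** 2 + a ** 2
F = lambda x, a, xi: 0.9 * x + a + xi
x, observed, V = 0.0, [], np.zeros_like(states)
for t in range(200):
    if observed:                               # estimation step
        rho_t = kde(np.array(observed))
        V = bellman_update(V, states, actions, xi_grid, rho_t, cost, F)
        w = np.array([rho_t(z) for z in xi_grid])
        w = w / w.sum()
        def q(a):                              # greedy one-stage lookahead at x
            nxt = np.clip(F(x, a, xi_grid), states[0], states[-1])
            return cost(x, a) + 0.95 * np.dot(w, np.interp(nxt, states, V))
        a_t = min(actions, key=q)
    else:
        a_t = 0.0                              # arbitrary action before any data
    xi = rng.normal()                          # disturbance realized and observed
    x = F(x, a_t, xi)
    observed.append(xi)
```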
We are concerned with a class of queueing systems with controlled service rates, in which the waiting times are observed only when they take the value zero. Applying a suitable filtering process, we show the existence of optimal control policies under a discounted optimality criterion.
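A toy illustration of the filtering idea, assuming Lindley-type waiting-time dynamics and a particle approximation of the conditional distribution; only the event {W_t = 0} is observed, as in the abstract, while every other modeling choice here is an assumption:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 2000                                   # particle count
particles = np.zeros(N)                    # belief over the hidden waiting time W_t
w_true, mu = 0.0, 1.2                      # true waiting time; controlled service rate

for t in range(100):
    # Lindley recursion W' = max(W + S - A, 0): service S ~ Exp(mu), interarrival A ~ Exp(1).
    w_true = max(w_true + rng.exponential(1 / mu) - rng.exponential(1.0), 0.0)
    particles = np.maximum(
        particles + rng.exponential(1 / mu, N) - rng.exponential(1.0, N), 0.0)
    if w_true == 0.0:                      # the only observation: the event {W = 0}
        particles[:] = 0.0                 # belief collapses to the point mass at 0
    elif (particles > 0.0).any():
        particles = rng.choice(particles[particles > 0.0], size=N)  # condition on {W > 0}
# `particles` approximates the conditional law of W_t given the zero-observations,
# which is the filtering process a service-rate controller would act on
```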
This work deals with a class of discrete-time zero-sum Markov games whose state process evolves according to the equation x_{n+1} = F(x_n, a_n, b_n, ξ_n), where a_n and b_n represent the actions of players 1 and 2, respectively, and {ξ_n} is a sequence of independent and identically distributed random variables with unknown distribution θ. Assuming a possibly unbounded payoff, and using the empirical distribution to estimate θ, we introduce approximation schemes for the value of the game as well as for optimal strategies, considering both...
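A small sketch of the empirical-distribution step, with a hypothetical payoff function and finite action grids: the expectation under θ is replaced by the average over the observed disturbances, and the pure lower and upper values of the resulting matrix game bracket the (mixed) value that such schemes approximate:

```python
import numpy as np

rng = np.random.default_rng(5)
xi_obs = rng.normal(size=300)              # observed disturbances: empirical law theta_n

def payoff(x, a, b, xi):
    # Hypothetical one-stage payoff r(x, a, b, xi); not the paper's.
    return (x + a - b) * xi - a ** 2 + b ** 2

x = 0.5
A = np.linspace(-1.0, 1.0, 5)              # player 1 action grid
B = np.linspace(-1.0, 1.0, 5)              # player 2 action grid
# Expected payoff with E_theta replaced by the empirical average over xi_obs:
R = np.array([[payoff(x, a, b, xi_obs).mean() for b in B] for a in A])
lower = R.min(axis=1).max()                # max_a min_b: pure lower value
upper = R.max(axis=0).min()                # min_b max_a: pure upper value
# lower <= value <= upper for the approximate game built from theta_n
```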
The paper deals with a class of discrete-time stochastic control processes under a discounted optimality criterion with random discount rate and possibly unbounded costs. The state process {x_t} and the discount process {α_t} evolve according to the coupled difference equations x_{t+1} = F(x_t, a_t, ξ_t), α_{t+1} = G(α_t, η_t), t = 1, 2, ..., where the state and discount disturbance processes {ξ_t} and {η_t} are sequences of i.i.d. random variables with densities ρ and μ, respectively. The main objective is to introduce approximation algorithms of the optimal...
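A simulation sketch of the coupled dynamics under hypothetical maps F and G and a placeholder policy, showing how the realized discount factors α_t compound into one sample of the randomly discounted cost:

```python
import numpy as np

rng = np.random.default_rng(6)

def F(x, a, xi):                 # hypothetical state map
    return 0.9 * x + a + xi

def G(alpha, eta):               # hypothetical discount map, kept inside (0, 1)
    return min(max(0.5 * alpha + eta, 0.01), 0.99)

x, alpha = 0.0, 0.9              # initial state and discount factor
disc, total = 1.0, 0.0           # running product of discounts; accumulated cost
for t in range(200):
    a = -0.5 * x                              # placeholder policy
    total += disc * (x ** 2 + a ** 2)         # cost weighted by prod_{k<t} alpha_k
    xi, eta = rng.normal(0.0, 0.5), rng.uniform(0.0, 0.4)
    x, alpha = F(x, a, xi), G(alpha, eta)
    disc *= alpha                             # incorporate the realized alpha_t
# `total` is one sample path of the randomly discounted cost being approximated
```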