Approximation and adaptive control of Markov processes: Average reward criterion
We consider a class of discrete-time Markov control processes with Borel state and action spaces, and i.i.d. disturbances with unknown density. Allowing for possibly unbounded costs, we combine suitable density estimation methods with approximation procedures for the optimal cost function to show the existence of a sequence of minimizers converging to an optimal stationary policy.
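For concreteness (the abstract does not display the model; the symbols $F$, $c$, $\xi_t$ and $J$ below are assumed notation, standard in this literature), the setting can be sketched as

$$
x_{t+1} = F(x_t, a_t, \xi_t), \qquad t = 0, 1, \ldots,
$$

with $\{\xi_t\}$ i.i.d. disturbances whose common density is unknown, together with the long-run expected average cost

$$
J(\pi, x) := \limsup_{n \to \infty} \frac{1}{n}\, E_x^{\pi}\!\left[\sum_{t=0}^{n-1} c(x_t, a_t)\right],
$$

where $c$ is the possibly unbounded one-stage cost; a stationary policy is average-cost optimal if it minimizes $J(\pi, x)$ for every initial state $x$.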
The paper deals with a class of discrete-time stochastic control processes under a discounted optimality criterion with random discount rate, and possibly unbounded costs. The state process and the discount process evolve according to a pair of coupled difference equations.
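The difference equations themselves are not reproduced in the abstract; one common form in the random-discount literature, given here only as an illustrative sketch (the maps $F$, $G$ and the driving noises $\xi_t$, $\eta_t$ are assumed names), is

$$
x_{t+1} = F(x_t, \alpha_t, a_t, \xi_t), \qquad \alpha_{t+1} = G(\alpha_t, \eta_t), \qquad t = 0, 1, \ldots,
$$

with the performance index being the total expected cost accumulated under the random discount factors,

$$
V(\pi, x, \alpha) := E_{x,\alpha}^{\pi}\!\left[\sum_{t=0}^{\infty} \Big(\prod_{k=0}^{t-1} \alpha_k\Big)\, c(x_t, a_t)\right],
$$

where the empty product for $t = 0$ equals 1.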
We study the limit behavior of certain classes of dependent random sequences (processes) that do not possess the Markov property. Assuming these processes depend on a control parameter, we show that the optimization of the control can be reduced to a problem of nonlinear optimization. Under certain hypotheses we establish the stability of such optimization problems.
This paper considers discrete-time Markov control processes on Borel spaces, with possibly unbounded costs, and the long-run average cost (AC) criterion. Under appropriate hypotheses on weighted norms for the cost function and the transition law, the existence of solutions to the average cost optimality inequality and the average cost optimality equation is shown, which in turn yields the existence of AC-optimal and AC-canonical policies, respectively.
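For reference, with $Q$ the transition law, $c$ the one-stage cost, $\rho^*$ the optimal average cost and $h$ a relative value function (standard notation, not quoted from the paper), the average cost optimality equation reads

$$
\rho^* + h(x) = \min_{a \in A(x)} \left[ c(x, a) + \int h(y)\, Q(dy \mid x, a) \right], \qquad x \in X,
$$

and the average cost optimality inequality is the relaxation in which the equality is replaced by "$\geq$". A stationary policy attaining the minimum in the equation is AC-canonical, while a minimizer in the inequality yields an AC-optimal policy under the growth conditions imposed through the weighted norms.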
This paper shows the convergence of the value iteration (or successive approximations) algorithm for average cost (AC) Markov control processes on Borel spaces, with possibly unbounded cost, under appropriate hypotheses on weighted norms for the cost function and the transition law. It is also shown that the aforementioned convergence implies strong forms of AC-optimality and the existence of forecast horizons.
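As a reminder of the algorithm whose convergence is studied (standard notation, assumed rather than quoted), value iteration computes

$$
v_n(x) = \min_{a \in A(x)} \left[ c(x, a) + \int v_{n-1}(y)\, Q(dy \mid x, a) \right], \qquad v_0 \equiv 0,
$$

and for the AC criterion convergence is usually expressed through the differences $v_n(x) - v_n(z)$ for a fixed reference state $z$, or $v_n(x) - n\rho^*$, which under hypotheses of the weighted-norm type converge to a solution $h$ of the average cost optimality equation.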