Approximation and estimation in Markov control processes under a discounted criterion
We consider a class of discrete-time Markov control processes with Borel state and action spaces, and $\mathbb{R}^k$-valued i.i.d. disturbances with unknown density $\rho$. Supposing possibly unbounded costs, we combine suitable density estimation methods for $\rho$ with approximation procedures for the optimal cost function to show the existence of a sequence of minimizers converging to an optimal stationary policy.
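As a point of reference, here is a minimal sketch of the standard discounted-cost setup such results are usually stated in; the notation (system equation $x_{t+1}=F(x_t,a_t,\xi_t)$, admissible action sets $A(x)$, one-stage cost $c$, discount factor $\alpha\in(0,1)$, density estimates $\hat\rho_t$) is assumed here and need not match the paper's exact formulation:
\[
  V(\pi,x) \;=\; \mathbb{E}_x^{\pi}\!\left[\sum_{t=0}^{\infty} \alpha^{t}\, c(x_t,a_t)\right],
  \qquad
  (T_{\hat\rho_t} v)(x) \;=\; \min_{a\in A(x)}\left\{ c(x,a) + \alpha \int_{\mathbb{R}^k} v\bigl(F(x,a,s)\bigr)\,\hat\rho_t(s)\,ds \right\},
\]
where $\hat\rho_t$ is a density estimate of $\rho$ built from the observed disturbances $\xi_0,\dots,\xi_{t-1}$. In schemes of this kind, the minimizers $f_t$ attaining the right-hand side form the sequence whose convergence to an optimal stationary policy is established.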