Time-varying Markov decision processes with state-action-dependent discount factors and unbounded costs
Beatris A. Escobedo-Trujillo, Carmen G. Higuera-Chan (2019)
Kybernetika
In this paper we are concerned with a class of time-varying discounted Markov decision models with unbounded costs and state-action dependent discount factors. Specifically, we study controlled systems whose state process evolves according to the equation , with state-action dependent discount factors of the form , where and are the control and the random disturbance at time , respectively. Assuming that the sequences of functions , and converge, in a certain sense, to ,...