Empirical approximation in Markov games under unbounded payoff: discounted and average criteria
This work deals with a class of discrete-time zero-sum Markov games whose state process $\{x_t\}$ evolves according to the equation $x_{t+1} = F(x_t, a_t, b_t, \xi_t)$, where $a_t$ and $b_t$ represent the actions of players 1 and 2, respectively, and $\{\xi_t\}$ is a sequence of independent and identically distributed random variables with unknown distribution $\theta$. Assuming a possibly unbounded payoff, and using the empirical distribution to estimate $\theta$, we introduce approximation schemes for the value of the game as well as for optimal strategies considering both,...
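
As an informal illustration of the empirical-distribution idea, the sketch below replaces an expectation over the unknown disturbance distribution with an average over observed disturbance samples. It is only a minimal sketch under stated assumptions: the names `empirical_expectation`, `F`, `v`, the linear dynamics, and the quadratic (unbounded) function are hypothetical choices made for this example and are not taken from the paper.

```python
from typing import Callable, Sequence


def empirical_expectation(
    v: Callable[[float], float],
    F: Callable[[float, float, float, float], float],
    x: float,
    a: float,
    b: float,
    noise_samples: Sequence[float],
) -> float:
    """Average v(F(x, a, b, xi_i)) over the observed disturbances xi_i.

    This is the integral of v(F(x, a, b, s)) with respect to the empirical
    distribution of the samples, used as a stand-in for the expectation
    under the unknown disturbance distribution.
    """
    if not noise_samples:
        raise ValueError("need at least one noise sample")
    total = sum(v(F(x, a, b, xi)) for xi in noise_samples)
    return total / len(noise_samples)


# Hypothetical dynamics (assumption): player 1 pushes the state up,
# player 2 pushes it down, and the disturbance enters additively.
def F(x: float, a: float, b: float, xi: float) -> float:
    return x + a - b + xi


# Hypothetical unbounded function of the next state (assumption).
def v(y: float) -> float:
    return y * y


# Observed disturbances; in practice these would be collected over time.
samples = [-1.0, 0.0, 1.0]
estimate = empirical_expectation(v, F, x=1.0, a=0.5, b=0.5, noise_samples=samples)
```

As more disturbances are observed, the sample list grows and the average is recomputed, so the estimate is driven by data rather than by knowledge of the true distribution.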