Calculating the variance in Markov-processes with random reward.
Francisco Benito (1982)
Trabajos de Estadística e Investigación Operativa
In this article we present a generalization of Markov decision processes with discrete time in which the immediate rewards in each period are not deterministic but random, with the first two moments of the distribution given. Formulas are developed to calculate the expected value and the variance of the reward of the process; these formulas generalize and partially correct other results. We make some observations about the distribution of rewards for processes with limited...