# Calculating the variance in Markov-processes with random reward

Trabajos de Estadística e Investigación Operativa (1982)

- Volume: 33, Issue: 3, page 73-85
- ISSN: 0041-0241

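## Abstract

In this article we present a generalization of Markov Decision Processes with discrete time where the immediate rewards in every period are not deterministic but random, with the first two moments of the distribution given.

Formulas are developed to calculate the expected value and the variance of the reward of the process, formulas which generalize and partially correct other results. We make some observations about the distribution of rewards for processes with limited or unlimited horizon and with or without discounting.

Applications to risk-sensitive policies are possible; this is illustrated in a numerical example where the results are validated by simulation.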

## How to cite

Benito, Francisco. "Calculating the variance in Markov-processes with random reward." Trabajos de Estadística e Investigación Operativa 33.3 (1982): 73-85. <http://eudml.org/doc/40668>.

@article{Benito1982,

abstract = {In this article we present a generalization of Markov Decision Processes with discrete time where the immediate rewards in every period are not deterministic but random, with the first two moments of the distribution given. Formulas are developed to calculate the expected value and the variance of the reward of the process, formulas which generalize and partially correct other results. We make some observations about the distribution of rewards for processes with limited or unlimited horizon and with or without discounting. Applications to risk-sensitive policies are possible; this is illustrated in a numerical example where the results are validated by simulation.},

author = {Benito, Francisco},

journal = {Trabajos de Estadística e Investigación Operativa},

keywords = {Dynamic programming; Decision processes; Markov process; random reward; finite Markov decision processes; mean; variance; finite horizon; total discounted reward; infinite horizon},

language = {eng},

number = {3},

pages = {73--85},

title = {Calculating the variance in Markov-processes with random reward},

url = {http://eudml.org/doc/40668},

volume = {33},

year = {1982},

}

TY - JOUR

AU - Benito, Francisco

TI - Calculating the variance in Markov-processes with random reward.

JO - Trabajos de Estadística e Investigación Operativa

PY - 1982

VL - 33

IS - 3

SP - 73

EP - 85

AB - In this article we present a generalization of Markov Decision Processes with discrete time where the immediate rewards in every period are not deterministic but random, with the first two moments of the distribution given. Formulas are developed to calculate the expected value and the variance of the reward of the process, formulas which generalize and partially correct other results. We make some observations about the distribution of rewards for processes with limited or unlimited horizon and with or without discounting. Applications to risk-sensitive policies are possible; this is illustrated in a numerical example where the results are validated by simulation.

LA - eng

KW - Dynamic programming; Decision processes; Markov process; random reward; finite Markov decision processes; mean; variance; finite horizon; total discounted reward; infinite horizon

UR - http://eudml.org/doc/40668

ER -