Calculating the variance in Markov-processes with random reward.

Francisco Benito

Trabajos de Estadística e Investigación Operativa (1982)

  • Volume: 33, Issue: 3, pages 73-85
  • ISSN: 0041-0241

Abstract

In this article we present a generalization of Markov Decision Processes with discrete time where the immediate rewards in every period are not deterministic but random, with the first two moments of the distribution given. Formulas are developed to calculate the expected value and the variance of the reward of the process, formulas which generalize and partially correct other results. We make some observations about the distribution of rewards for processes with limited or unlimited horizon and with or without discounting. Applications to risk-sensitive policies are possible; this is illustrated in a numerical example where the results are revalidated by simulation.
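
As a concrete illustration of the computation the abstract describes, below is a minimal sketch in Python (with illustrative names; this is not the paper's own notation) of a backward recursion for the mean and variance of the total discounted reward of a finite Markov chain whose per-period reward in state i is random with given mean m_i and variance s_i^2, followed by a Monte Carlo revalidation in the spirit of the paper's numerical example. The recursion assumes the random rewards are independent of the transitions and of one another; the simulation additionally assumes normally distributed rewards, since the model specifies only the first two moments.

    import numpy as np

    def reward_moments(P, mean_r, var_r, beta, horizon):
        """Mean V and variance W of the total discounted reward over a
        finite horizon, by backward recursion.  With n periods remaining
        from state i:
            V_n(i) = m_i + beta * sum_j p_ij V_{n-1}(j)
            W_n(i) = s_i^2 + beta^2 * ( sum_j p_ij (W_{n-1}(j) + V_{n-1}(j)^2)
                                        - (sum_j p_ij V_{n-1}(j))^2 )
        """
        P = np.asarray(P, float)
        V = np.zeros(len(mean_r))   # expected remaining reward
        W = np.zeros(len(mean_r))   # variance of remaining reward
        for _ in range(horizon):
            EV, EV2, EW = P @ V, P @ V**2, P @ W
            W = var_r + beta**2 * (EW + EV2 - EV**2)   # uses the old V
            V = mean_r + beta * EV
        return V, W

    # Two-state example, revalidated by simulation (normal rewards are an
    # extra assumption -- only the first two moments are given in the model).
    P       = np.array([[0.7, 0.3],
                        [0.4, 0.6]])
    mean_r  = np.array([1.0, 5.0])
    var_r   = np.array([0.25, 4.0])
    beta, horizon = 0.9, 20

    V, W = reward_moments(P, mean_r, var_r, beta, horizon)

    rng = np.random.default_rng(0)
    totals = np.empty(20_000)
    for k in range(totals.size):
        state, disc, total = 0, 1.0, 0.0
        for _ in range(horizon):
            total += disc * rng.normal(mean_r[state], np.sqrt(var_r[state]))
            disc  *= beta
            state  = rng.choice(2, p=P[state])
        totals[k] = total

    print(V[0], W[0])                     # analytic mean and variance from state 0
    print(totals.mean(), totals.var())    # simulated estimates should agree

The variance recursion decomposes the variance of the remaining reward by conditioning on the next state, which is why both the means V and the second-moment terms of the successor states enter the update.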

How to cite

Benito, Francisco. "Calculating the variance in Markov-processes with random reward." Trabajos de Estadística e Investigación Operativa 33.3 (1982): 73-85. <http://eudml.org/doc/40668>.

@article{Benito1982,
abstract = {In this article we present a generalization of Markov Decision Processes with discrete time where the immediate rewards in every period are not deterministic but random, with the first two moments of the distribution given. Formulas are developed to calculate the expected value and the variance of the reward of the process, formulas which generalize and partially correct other results. We make some observations about the distribution of rewards for processes with limited or unlimited horizon and with or without discounting. Applications to risk-sensitive policies are possible; this is illustrated in a numerical example where the results are revalidated by simulation.},
author = {Benito, Francisco},
journal = {Trabajos de Estadística e Investigación Operativa},
keywords = {Dynamic programming; Decision processes; Markov process; random reward; finite Markov decision processes; mean; variance; finite-horizon; total discounted reward; infinite horizon},
language = {eng},
number = {3},
pages = {73-85},
title = {Calculating the variance in Markov-processes with random reward.},
url = {http://eudml.org/doc/40668},
volume = {33},
year = {1982},
}

TY - JOUR
AU - Benito, Francisco
TI - Calculating the variance in Markov-processes with random reward.
JO - Trabajos de Estadística e Investigación Operativa
PY - 1982
VL - 33
IS - 3
SP - 73
EP - 85
AB - In this article we present a generalization of Markov Decision Processes with discrete time where the immediate rewards in every period are not deterministic but random, with the first two moments of the distribution given. Formulas are developed to calculate the expected value and the variance of the reward of the process, formulas which generalize and partially correct other results. We make some observations about the distribution of rewards for processes with limited or unlimited horizon and with or without discounting. Applications to risk-sensitive policies are possible; this is illustrated in a numerical example where the results are revalidated by simulation.
LA - eng
KW - Dynamic programming; Decision processes; Markov process; random reward; finite Markov decision processes; mean; variance; finite-horizon; total discounted reward; infinite horizon
UR - http://eudml.org/doc/40668
ER -
