Displaying 1 – 5 of 5

Calculating the variance in Markov-processes with random reward.

Francisco Benito (1982)

Trabajos de Estadística e Investigación Operativa

In this article we present a generalization of Markov Decision Processes with discrete time where the immediate rewards in every period are not deterministic but random, with the first two moments of the distribution given. Formulas are developed to calculate the expected value and the variance of the reward of the process; these formulas generalize and partially correct other results. We make some observations about the distribution of rewards for processes with limited or unlimited horizon and...

Colored decision process Petri nets: modeling, analysis and stability

Julio Clempner (2005)

International Journal of Applied Mathematics and Computer Science

In this paper we introduce a new modeling paradigm for developing a decision process representation called the Colored Decision Process Petri Net (CDPPN). It extends the Colored Petri Net (CPN) theoretic approach to include Markov decision processes. CPNs are used for process representation, taking advantage of their formal semantics and graphical display. A Markov decision process is utilized as a tool for trajectory planning via a utility function. The main point of the CDPPN is its ability to...

Constrained optimality problem of Markov decision processes with Borel spaces and varying discount factors

Xiao Wu, Yanqiu Tang (2021)

Kybernetika

This paper focuses on the constrained optimality of discrete-time Markov decision processes (DTMDPs) with state-dependent discount factors, Borel state and compact Borel action spaces, and possibly unbounded costs. Using the properties of the so-called occupation measures of policies, and by transforming the original constrained optimality problem of DTMDPs into a convex program, we prove the existence of an optimal randomized stationary policy under reasonable conditions.

Contrôle dynamique de flux dans un système d'attente avec panne [Dynamic flow control in a queueing system with breakdowns]

A. Haqiq, N. Mikou (2010)

RAIRO - Operations Research

We consider two parallel M/M/1 queues. The server at one of the queues is subject to intermittent breakdowns. Using the theory of dynamic programming, we determine an optimal threshold policy, which consists of transferring, when necessary, the customers that arrive at the first queue to the second queue in order to minimize an instantaneous cost that depends on the two queue lengths.
