Displaying similar documents to “Exact solution of the Bellman equation for a β -discounted reward in a two-armed bandit with switching arms.”

Estimates of stability of Markov control processes with unbounded costs

Evgueni I. Gordienko, Francisco Salem-Silva (2000)

Kybernetika

Similarity:

For a discrete-time Markov control process with the transition probability p , we compare the total discounted costs V β ( π β ) and V β ( π ˜ β ) , when applying the optimal control policy π β and its approximation π ˜ β . The policy π ˜ β is optimal for an approximating process with the transition probability p ˜ . A cost per stage for considered processes can be unbounded. Under certain ergodicity assumptions we establish the upper bound for the relative stability index [ V β ( π ˜ β ) - V β ( π β ) ] / V β ( π β ) . This bound does not depend...