Displaying similar documents to “Exact solution of the Bellman equation for a β-discounted reward in a two-armed bandit with switching arms.”

Estimates of stability of Markov control processes with unbounded costs

Evgueni I. Gordienko, Francisco Salem-Silva (2000)

Kybernetika

For a discrete-time Markov control process with transition probability p, we compare the total discounted costs V_β(π_β) and V_β(π̃_β) obtained by applying the optimal control policy π_β and its approximation π̃_β. The policy π̃_β is optimal for an approximating process with transition probability p̃. The cost per stage of the processes considered can be unbounded. Under certain ergodicity assumptions we establish an upper bound on the relative stability index [V_β(π̃_β) − V_β(π_β)] / V_β(π_β). This bound does not depend...
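The setup in this abstract can be illustrated numerically: take a small finite Markov control process, compute the policy that is optimal for the true transition law p and the policy that is optimal for a perturbed law p̃, evaluate both under the true model, and form the relative stability index. The sketch below is an assumption-laden toy (the specific MDP, costs, and perturbation are invented for illustration; the paper's actual setting allows unbounded costs and general state spaces).

```python
import numpy as np

def policy_cost(P, c, policy, beta):
    """Total discounted cost V_beta(pi) of a stationary policy.

    P: (A, S, S) transition matrices P[a, s, t]; c: (S, A) stage costs.
    Solves the linear system (I - beta * P_pi) V = c_pi exactly.
    """
    S = c.shape[0]
    P_pi = np.array([P[policy[s], s] for s in range(S)])
    c_pi = np.array([c[s, policy[s]] for s in range(S)])
    return np.linalg.solve(np.eye(S) - beta * P_pi, c_pi)

def optimal_policy(P, c, beta, iters=500):
    """Value iteration for the discounted-cost criterion; returns the
    greedy (cost-minimizing) stationary policy."""
    S, A = c.shape
    V = np.zeros(S)
    for _ in range(iters):
        # Q[s, a] = c[s, a] + beta * sum_t P[a, s, t] * V[t]
        Q = c + beta * np.einsum('ast,t->sa', P, V)
        V = Q.min(axis=1)
    return Q.argmin(axis=1)

# Toy 2-state, 2-action model (hypothetical numbers).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])   # true transition law p
c = np.array([[1.0, 2.0], [2.0, 0.5]])      # stage costs c(s, a) > 0
beta = 0.9
P_tilde = 0.95 * P + 0.05 * 0.5             # perturbed law p~ (mixed with uniform)

pi_beta = optimal_policy(P, c, beta)        # optimal for the true model
pi_tilde = optimal_policy(P_tilde, c, beta) # optimal for the approximating model

V_opt = policy_cost(P, c, pi_beta, beta)    # V_beta(pi_beta) under true p
V_apx = policy_cost(P, c, pi_tilde, beta)   # V_beta(pi~_beta) under true p
index = np.max((V_apx - V_opt) / V_opt)     # relative stability index
print(pi_beta, pi_tilde, index)
```

Since π_β minimizes the true discounted cost, V_β(π̃_β) ≥ V_β(π_β) componentwise, so the index is nonnegative; the paper's contribution is an a-priori upper bound on it in terms of the distance between p and p̃, without computing either optimal policy.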