# Bayesian parameter estimation and adaptive control of Markov processes with time-averaged cost

Applicationes Mathematicae (1998)

- Volume: 25, Issue: 3, page 339-358
- ISSN: 1233-7234

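## Abstract

This paper considers Bayesian parameter estimation and an associated adaptive control scheme for controlled Markov chains and diffusions with time-averaged cost. Asymptotic behaviour of the posterior law of the parameter given the observed trajectory is analyzed. This analysis suggests a "cost-biased" estimation scheme and associated self-tuning adaptive control. This is shown to be asymptotically optimal in the almost sure sense.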

## How to cite

Borkar, V., and Associate, S. "Bayesian parameter estimation and adaptive control of Markov processes with time-averaged cost." Applicationes Mathematicae 25.3 (1998): 339-358. <http://eudml.org/doc/219208>.

@article{Borkar1998,

abstract = {This paper considers Bayesian parameter estimation and an associated adaptive control scheme for controlled Markov chains and diffusions with time-averaged cost. Asymptotic behaviour of the posterior law of the parameter given the observed trajectory is analyzed. This analysis suggests a "cost-biased" estimation scheme and associated self-tuning adaptive control. This is shown to be asymptotically optimal in the almost sure sense.},

author = {Borkar, V. and Associate, S.},

journal = {Applicationes Mathematicae},

keywords = {time-averaged cost; adaptive control; asymptotic optimality; cost-biased estimate; Bayesian estimation},

language = {eng},

number = {3},

pages = {339-358},

title = {Bayesian parameter estimation and adaptive control of Markov processes with time-averaged cost},

url = {http://eudml.org/doc/219208},

volume = {25},

year = {1998},

}

TY - JOUR

AU - Borkar, V.

AU - Associate, S.

TI - Bayesian parameter estimation and adaptive control of Markov processes with time-averaged cost

JO - Applicationes Mathematicae

PY - 1998

VL - 25

IS - 3

SP - 339

EP - 358

AB - This paper considers Bayesian parameter estimation and an associated adaptive control scheme for controlled Markov chains and diffusions with time-averaged cost. Asymptotic behaviour of the posterior law of the parameter given the observed trajectory is analyzed. This analysis suggests a "cost-biased" estimation scheme and associated self-tuning adaptive control. This is shown to be asymptotically optimal in the almost sure sense.

LA - eng

KW - time-averaged cost; adaptive control; asymptotic optimality; cost-biased estimate; Bayesian estimation

UR - http://eudml.org/doc/219208

ER -

## References

- [1] R. Agrawal, D. Teneketzis and V. Anantharam, Asymptotically efficient adaptive allocation schemes for controlled Markov chains: finite parameter space, IEEE Trans. Automatic Control AC-34 (1989), 1249-1259. Zbl0689.93039
- [2] A. Barron, Are Bayes rules consistent in information?, in: Problems in Communication and Computation, T. M. Cover and B. Gopinath (eds.), Springer, New York, 1987, 85-91.
- [3] R. N. Bhattacharya, Asymptotic behaviour of several dimensional diffusions, in: Stochastic Nonlinear Systems, L. Arnold and R. Lefever (eds.), Springer, New York, 1981, 86-91.
- [4] D. Blackwell and L. Dubins, Merging of opinions with increasing information, Ann. Math. Statist. 33 (1962), 882-887. Zbl0109.35704
- [5] V. S. Borkar, Control of Markov chains with long run average cost criterion, in: Stochastic Differential Systems, Stochastic Control Theory and Applications, W. H. Fleming and P. L. Lions (eds.), Springer, New York, 1987, 57-77.
- [6] V. S. Borkar, The Kumar-Becker-Lin scheme revisited, J. Optim. Theory Appl. 66 (1990), 289-309. Zbl0682.93060
- [7] V. S. Borkar, Self-tuning control of diffusions without the identifiability condition, ibid. 68 (1991), 117-137. Zbl0697.93036
- [8] V. S. Borkar, On the Milito-Cruz adaptive control scheme for Markov chains, ibid. 77 (1993), 387-397. Zbl0791.93055
- [9] V. S. Borkar, A modified self-tuner for controlled diffusions with an unknown parameter, in: Mathematical Theory of Control (Bombay, 1990), A. V. Balakrishnan and M. C. Joshi (eds.), Marcel Dekker, 1992, 57-67. Zbl0790.93082
- [10] V. S. Borkar and M. K. Ghosh, Ergodic and adaptive control of nearest neighbour motions, Math. Control Signals and Systems 4 (1991), 81-98. Zbl0736.93078
- [11] V. S. Borkar and M. K. Ghosh, Ergodic control of multidimensional diffusions II: adaptive control, Appl. Math. Optim. 21 (1990), 191-220. Zbl0691.93027
- [12] V. S. Borkar and P. P. Varaiya, Identification and adaptive control of Markov chains I: finite parameter case, IEEE Trans. Automatic Control 24 (1979), 953-957. Zbl0416.93065
- [13] V. S. Borkar and P. P. Varaiya, Identification and adaptive control of Markov chains, SIAM J. Control Optim. 20 (1982), 470-488. Zbl0491.93063
- [14] E. K. P. Chong and P. J. Ramadge, Stochastic optimization of regenerative systems using infinitesimal perturbation analysis, IEEE Trans. Automatic Control 39 (1994), 1400-1410. Zbl0806.93058
- [15] Y. S. Chow and H. Teicher, Probability Theory: Independence, Interchangeability, Martingales, Springer, New York, 1979.
- [16] G. B. Di Masi and Ł. Stettner, Bayesian ergodic adaptive control of discrete time Markov processes, Stochastics Stochastic Reports 54 (1995), 301-316. Zbl0855.93103
- [17] B. Doshi and S. E. Shreve, Randomized self-tuning control of Markov chains, J. Appl. Probab. 17 (1980), 726-734. Zbl0442.93054
- [18] B. Hajek, Hitting-time and occupation-time bounds implied by drift analysis with applications, Adv. Appl. Probab. 14 (1982), 502-525. Zbl0495.60094
- [19] P. R. Kumar and A. Becker, A new family of optimal adaptive controllers for Markov chains, IEEE Trans. Automatic Control 27 (1982), 137-142. Zbl0471.93069
- [20] P. R. Kumar and W. Lin, Optimal adaptive controllers for Markov chains, ibid. 27 (1982), 756-774. Zbl0488.93036
- [21] P. R. Kumar and P. P. Varaiya, Stochastic Systems--Estimation, Identification and Adaptive Control, Prentice-Hall, 1986. Zbl0706.93057
- [22] P. Mandl, Estimation and control in Markov chains, Adv. Appl. Probab. 6 (1974), 40-60. Zbl0281.60070
- [23] R. Milito and J. B. Cruz, Jr., An optimization oriented approach to adaptive control of Markov chains, IEEE Trans. Automatic Control 32 (1987), 754-762. Zbl0632.93080
- [24] J. N. Tsitsiklis, Asynchronous stochastic approximation and Q-learning, Machine Learning 16 (1994), 195-202. Zbl0820.68105
- [25] K. Van Hee, Bayesian Control of Markov Chains, Math. Center Tracts, 95, Math. Center, Amsterdam, 1978.
