Recursive self-tuning control of finite Markov chains

Vivek Borkar

Applicationes Mathematicae (1997)

  • Volume: 24, Issue: 2, page 169-188
  • ISSN: 1233-7234

Abstract

A recursive self-tuning control scheme for finite Markov chains is proposed wherein the unknown parameter is estimated by a stochastic approximation scheme for maximizing the log-likelihood function and the control is obtained via a relative value iteration algorithm. The analysis uses the asymptotic o.d.e.s associated with these.
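As a rough illustration of the relative value iteration step mentioned in the abstract, the sketch below runs the textbook algorithm on a small finite chain whose transition probabilities are known. It is not the paper's recursive scheme, which couples this kind of iteration with a stochastic approximation estimate of the unknown parameter. The cost-minimizing average-cost formulation, the function name `relative_value_iteration`, the argument layout, and the two-state example are all assumptions made for illustration.

```python
def relative_value_iteration(P, c, ref_state=0, tol=1e-10, max_iter=10_000):
    """Textbook relative value iteration for an average-cost finite chain.

    P[a][i][j] : probability of moving from state i to j under action a
                 (assumed known here; in the paper the model is estimated).
    c[i][a]    : one-step cost of taking action a in state i.
    Returns (rho, h, policy): the estimated optimal average cost, the
    relative value function with h[ref_state] == 0, and a greedy policy.
    """
    n_states = len(c)
    n_actions = len(P)

    def backup(h):
        # Q-values of a one-step Bellman backup applied to h.
        return [
            [
                c[i][a] + sum(P[a][i][j] * h[j] for j in range(n_states))
                for a in range(n_actions)
            ]
            for i in range(n_states)
        ]

    h = [0.0] * n_states
    for _ in range(max_iter):
        th = [min(row) for row in backup(h)]
        # Subtracting the value at the reference state keeps h bounded.
        h_new = [v - th[ref_state] for v in th]
        converged = max(abs(x - y) for x, y in zip(h_new, h)) < tol
        h = h_new
        if converged:
            break

    q = backup(h)
    rho = min(q[ref_state])  # at the fixed point, (T h)(ref_state) = rho
    policy = [min(range(n_actions), key=lambda a: q[i][a]) for i in range(n_states)]
    return rho, h, policy


# Hypothetical two-state example: state 0 is free, state 1 costs 1 per step.
# Action 0 tends to stay put; action 1 jumps to either state with probability 1/2.
P_example = [
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.5, 0.5], [0.5, 0.5]],
]
c_example = [[0.0, 0.0], [1.0, 1.0]]
rho, h, policy = relative_value_iteration(P_example, c_example)
```

In this toy example the greedy policy stays in the free state and leaves the costly one as fast as it can. Convergence of the iteration relies on the usual unichain and aperiodicity conditions, which hold here because every state has a self-loop.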

How to cite

Borkar, Vivek. "Recursive self-tuning control of finite Markov chains." Applicationes Mathematicae 24.2 (1997): 169-188. <http://eudml.org/doc/219160>.

@article{Borkar1997,
abstract = {A recursive self-tuning control scheme for finite Markov chains is proposed wherein the unknown parameter is estimated by a stochastic approximation scheme for maximizing the log-likelihood function and the control is obtained via a relative value iteration algorithm. The analysis uses the asymptotic o.d.e.s associated with these.},
author = {Borkar, Vivek},
journal = {Applicationes Mathematicae},
keywords = {controlled Markov chains; stochastic approximation; relative value iteration; self-tuning control; adaptive control; recursive self-tuning control scheme; finite Markov chains; stochastic approximation scheme},
language = {eng},
number = {2},
pages = {169-188},
title = {Recursive self-tuning control of finite Markov chains},
url = {http://eudml.org/doc/219160},
volume = {24},
year = {1997},
}

TY - JOUR
AU - Borkar, Vivek
TI - Recursive self-tuning control of finite Markov chains
JO - Applicationes Mathematicae
PY - 1997
VL - 24
IS - 2
SP - 169
EP - 188
AB - A recursive self-tuning control scheme for finite Markov chains is proposed wherein the unknown parameter is estimated by a stochastic approximation scheme for maximizing the log-likelihood function and the control is obtained via a relative value iteration algorithm. The analysis uses the asymptotic o.d.e.s associated with these.
LA - eng
KW - controlled Markov chains; stochastic approximation; relative value iteration; self-tuning control; adaptive control; recursive self-tuning control scheme; finite Markov chains; stochastic approximation scheme
UR - http://eudml.org/doc/219160
ER -

References

  1. D. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models, Prentice-Hall, Englewood Cliffs, N.J., 1987. Zbl0649.93001
  2. V. S. Borkar, Identification and adaptive control of Markov chains, Ph.D. Thesis, Dept. of Electrical Engrg. and Computer Science, Univ. of California, Berkeley, 1980. Zbl0491.93063
  3. V. S. Borkar, Topics in Controlled Markov Chains, Pitman Res. Notes in Math. 240, Longman Scientific and Technical, Harlow, 1991. Zbl0725.93082
  4. V. S. Borkar, The Kumar-Becker-Lin scheme revisited, J. Optim. Theory Appl. 66 (1990), 289-309. Zbl0682.93060
  5. V. S. Borkar, On Milito-Cruz adaptive control scheme for Markov chains, ibid. 77 (1993), 385-393. Zbl0791.93055
  6. V. S. Borkar and K. Soumyanath, A new analog parallel scheme for fixed point computation I: theory, submitted. Zbl0953.65038
  7. V. S. Borkar and P. P. Varaiya, Adaptive control of Markov chains I: finite parameter case, IEEE Trans. Automat. Control AC-24 (1979), 953-957. Zbl0416.93065
  8. V. S. Borkar and P. P. Varaiya, Identification and adaptive control of Markov chains, SIAM J. Control Optim. 20 (1982), 470-488. Zbl0491.93063
  9. Y.-S. Chow and H. Teicher, Probability Theory: Independence, Interchangeability, Martingales, Springer, New York, 1979.
  10. B. Doshi and S. Shreve, Randomized self-tuning control of Markov chains, J. Appl. Probab. 17 (1980), 726-734. Zbl0442.93054
  11. Y. El Fattah, Recursive algorithms for adaptive control of finite Markov chains, IEEE Trans. Systems Man Cybernet. SMC-11 (1981), 135-144.
  12. Y. El Fattah, Gradient approach for recursive estimation and control in finite Markov chains, Adv. Appl. Probab. 13 (1981), 778-803. Zbl0475.60051
  13. M. Hirsch, Convergent activation dynamics in continuous time networks, Neural Networks 2 (1987), 331-349.
  14. A. Jalali and M. Ferguson, Adaptive control of Markov chains with local updates, Systems Control Lett. 14 (1990), 209-218. Zbl0699.93047
  15. P. R. Kumar and A. Becker, A new family of adaptive optimal controllers for Markov chains, IEEE Trans. Automat. Control AC-27 (1982), 137-142. Zbl0471.93069
  16. P. R. Kumar and W. Lin, Optimal adaptive controllers for Markov chains, ibid., 756-774. Zbl0488.93036
  17. H. Kushner and D. Clark, Stochastic Approximation for Constrained and Unconstrained Systems, Springer, Berlin, 1978. Zbl0381.60004
  18. P. Mandl, Estimation and control in Markov chains, Adv. Appl. Probab. 6 (1974), 40-60. Zbl0281.60070
  19. R. Milito and J. B. Cruz Jr., An optimization oriented approach to adaptive control of Markov chains, IEEE Trans. Automat. Control AC-32 (1987), 754-762. Zbl0632.93080
  20. J. Neveu, Discrete-Parameter Martingales, North-Holland, Amsterdam, 1975.
  21. B. Sagalovsky, Adaptive control and parameter estimation in Markov chains: a linear case, IEEE Trans. Automat. Control AC-27 (1982), 414-417. Zbl0479.93061
  22. Ł. Stettner, On nearly self-optimizing strategies for a discrete-time uniformly ergodic adaptive model, Appl. Math. Optim. 27 (1993), 161-177. Zbl0769.93084
  23. T. Yoshizawa, Stability Theory by Liapunov's Second Method, The Mathematical Society of Japan, 1966.
