Displaying similar documents to “Estimation of hidden Markov models for a partially observed risk sensitive control problem”

Recursive self-tuning control of finite Markov chains

Vivek Borkar (1997)

Applicationes Mathematicae

Similarity:

A recursive self-tuning control scheme for finite Markov chains is proposed wherein the unknown parameter is estimated by a stochastic approximation scheme for maximizing the log-likelihood function and the control is obtained via a relative value iteration algorithm. The analysis uses the asymptotic o.d.e.s associated with these.

Bayesian parameter estimation and adaptive control of Markov processes with time-averaged cost

V. Borkar, S. Associate (1998)

Applicationes Mathematicae

Similarity:

This paper considers Bayesian parameter estimation and an associated adaptive control scheme for controlled Markov chains and diffusions with time-averaged cost. Asymptotic behaviour of the posterior law of the parameter given the observed trajectory is analyzed. This analysis suggests a "cost-biased" estimation scheme and associated self-tuning adaptive control. This is shown to be asymptotically optimal in the almost sure sense.

Adaptive control for discrete-time Markov processes with unbounded costs: Discounted criterion

Evgueni I. Gordienko, J. Adolfo Minjárez-Sosa (1998)

Kybernetika

Similarity:

We study the adaptive control problem for discrete-time Markov control processes with Borel state and action spaces and possibly unbounded one-stage costs. The processes are given by recurrent equations x t + 1 = F ( x t , a t , ξ t ) , t = 0 , 1 , ... with i.i.d. k -valued random vectors ξ t whose density ρ is unknown. Assuming observability of ξ t we propose the procedure of statistical estimation of ρ that allows us to prove discounted asymptotic optimality of two types of adaptive policies used early for the processes with bounded...