Displaying 261 – 280 of 862


Epoch-incremental reinforcement learning algorithms

Roman Zajdel (2013)

International Journal of Applied Mathematics and Computer Science

In this article, a new class of epoch-incremental reinforcement learning algorithms is proposed. In the incremental mode, the fundamental TD(0) or TD(λ) algorithm is performed and an environment model is created. In the epoch mode, the distances from past-active states to the terminal state are computed on the basis of the environment model. These distances and the terminal-state reinforcement signal are used to improve the agent policy.

Equivalent cost functionals and stochastic linear quadratic optimal control problems

Zhiyong Yu (2013)

ESAIM: Control, Optimisation and Calculus of Variations

This paper is concerned with the stochastic linear quadratic optimal control problems (LQ problems, for short) for which the coefficients are allowed to be random and the cost functionals are allowed to have negative weights on the square of control variables. We propose a new method, the equivalent cost functional method, to deal with the LQ problems. Compared with the classical methods, the new method is simple, flexible and non-abstract. The new method can also be applied to deal with nonlinear...

Ergodic control of linear stochastic equations in a Hilbert space with fractional Brownian motion

Tyrone E. Duncan, B. Maslowski, B. Pasik-Duncan (2015)

Banach Center Publications

A linear-quadratic control problem with an infinite time horizon for some infinite dimensional controlled stochastic differential equations driven by a fractional Brownian motion is formulated and solved. The feedback form of the optimal control and the optimal cost are given explicitly. The optimal control is the sum of the well known linear feedback control for the associated infinite dimensional deterministic linear-quadratic control problem and a suitable prediction of the adjoint optimal system...

Estimació del pol i de la variància del soroll d'un model AR(1) mitjançant filtratge no lineal [Estimation of the pole and the noise variance of an AR(1) model by nonlinear filtering]

M.ª Pilar Muñoz Gracia, Juan José Egozcue Rubí, Manuel Martí Recobert (1988)

Qüestiió

The estimation of the parameters associated with an ARMA process can be posed as a nonlinear filtering problem. To obtain a recursive estimator of these parameters, an augmented state vector is defined that includes the state variables and the parameters to be estimated. Using a Bayesian approach, the posterior distribution of the augmented state vector is determined. The synthesis of the nonlinear filter makes it possible to: i) estimate the parameters and determine their precision for a given sample size,...

Estimates for perturbations of average Markov decision processes with a minimal state and upper bounded by stochastically ordered Markov chains

Raúl Montes-de-Oca, Francisco Salem-Silva (2005)

Kybernetika

This paper deals with Markov decision processes (MDPs) with real state space for which its minimum is attained, and that are upper bounded by (uncontrolled) stochastically ordered (SO) Markov chains. We consider MDPs with (possibly) unbounded costs, and to evaluate the quality of each policy, we use the objective function known as the average cost. For this objective function we consider two Markov control models ℙ and ℙ₁. ℙ and ℙ₁ have the same components except for the transition laws. The transition...

Estimation and control in finite Markov decision processes with the average reward criterion

Rolando Cavazos-Cadena, Raúl Montes-de-Oca (2004)

Applicationes Mathematicae

This work concerns Markov decision chains with finite state and action sets. The transition law satisfies the simultaneous Doeblin condition but is unknown to the controller, and the problem of determining an optimal adaptive policy with respect to the average reward criterion is addressed. A subset of policies is identified so that, when the system evolves under a policy in that class, the frequency estimators of the transition law are consistent on an essential set of admissible state-action pairs,...

Estimation of feedwater heater parameters based on a grey-box approach

Tomasz Barszcz, Piotr Czop (2011)

International Journal of Applied Mathematics and Computer Science

The first-principle modeling of a feedwater heater operating in a coal-fired power unit is presented, along with a theoretical discussion concerning its structural simplifications, parameter estimation, and dynamical validation. The model is a part of the component library of modeling environments, called the Virtual Power Plant (VPP). The main purpose of the VPP is simulation of power generation installations intended for early warning diagnostic applications. The model was developed in the Matlab/Simulink...

Estimation of hidden Markov models for a partially observed risk sensitive control problem

Bernard Frankpitt, John S. Baras (1998)

Kybernetika

This paper provides a summary of our recent work on the problem of combined estimation and control of systems described by finite state, hidden Markov models. We establish the stochastic framework for the problem, formulate a separated control policy with risk-sensitive cost functional, describe an estimation scheme for the parameters of the hidden Markov model that describes the plant, and finally indicate how the combined estimation and control problem can be re-formulated in a framework that...
