Displaying 1 – 20 of 21

Adaptive control for discrete-time Markov processes with unbounded costs: Discounted criterion

Evgueni I. Gordienko, J. Adolfo Minjárez-Sosa (1998)

Kybernetika

We study the adaptive control problem for discrete-time Markov control processes with Borel state and action spaces and possibly unbounded one-stage costs. The processes are given by the recurrent equations $x_{t+1} = F(x_t, a_t, \xi_t)$, $t = 0, 1, \ldots$, with i.i.d. $\mathbb{R}^k$-valued random vectors $\xi_t$ whose density $\rho$ is unknown. Assuming observability of $\xi_t$, we propose a procedure for the statistical estimation of $\rho$ that allows us to prove discounted asymptotic optimality of two types of adaptive policies used earlier for processes with bounded costs.

Data-driven models for fault detection using kernel PCA: A water distribution system case study

Adam Nowicki, Michał Grochowski, Kazimierz Duzinkiewicz (2012)

International Journal of Applied Mathematics and Computer Science

Kernel Principal Component Analysis (KPCA), an example of machine learning, can be considered a non-linear extension of the PCA method. While various applications of KPCA are known, this paper explores the possibility of using it to build a data-driven model of a non-linear system: the water distribution system of the town of Chojnice (Poland). This model is utilised for fault detection, with the emphasis on water leakage detection. A systematic description of the system's framework is followed by...

Double-stepped adaptive control for hybrid systems with unknown Markov jumps and stochastic noises

Shuping Tan, Ji-Feng Zhang (2009)

ESAIM: Control, Optimisation and Calculus of Variations

This paper is concerned with the sampled-data based adaptive linear quadratic (LQ) control of hybrid systems with both unmeasurable Markov jump processes and stochastic noises. By the least matching error estimation algorithm, parameter estimates are presented. By a double-step (DS) sampling approach and the certainty equivalence principle, a sampled-data based adaptive LQ control is designed. The DS-approach is characterized by a comparatively large estimation step for parameter estimation and...

Double-stepped adaptive control for hybrid systems with unknown Markov jumps and stochastic noises

Shuping Tan, Ji-Feng Zhang (2008)

ESAIM: Control, Optimisation and Calculus of Variations

This paper is concerned with the sampled-data based adaptive linear quadratic (LQ) control of hybrid systems with both unmeasurable Markov jump processes and stochastic noises. By the least matching error estimation algorithm, parameter estimates are presented. By a double-step (DS) sampling approach and the certainty equivalence principle, a sampled-data based adaptive LQ control is designed. The DS-approach is characterized by a comparatively large estimation step for parameter estimation and...

Employing different loss functions for the classification of images via supervised learning

Radu Boţ, André Heinrich, Gert Wanka (2014)

Open Mathematics

Supervised learning methods are powerful techniques to learn a function from a given set of labeled data, the so-called training data. In this paper the support vector machines approach is applied to an image classification task. Starting with the corresponding Tikhonov regularization problem, reformulated as a convex optimization problem, we introduce a conjugate dual problem to it and prove that, whenever strong duality holds, the function to be learned can be expressed via the dual optimal solutions....

Epoch-incremental reinforcement learning algorithms

Roman Zajdel (2013)

International Journal of Applied Mathematics and Computer Science

In this article, a new class of epoch-incremental reinforcement learning algorithms is proposed. In the incremental mode, the fundamental TD(0) or TD(λ) algorithm is performed and an environment model is created. In the epoch mode, on the basis of the environment model, the distances of past-active states to the terminal state are computed. These distances and the reinforcement terminal state signal are used to improve the agent policy.
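
As a rough illustration of the TD(0) step mentioned in this abstract, the following is a minimal sketch of the standard tabular TD(0) value update. It is not the epoch-incremental algorithm of the paper: the function name `td0_update`, the three-state chain, and the values of `alpha` and `gamma` are assumptions made up for the example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9, terminal=False):
    """Apply one tabular TD(0) update in place and return the TD error."""
    # Bootstrapped target: reward plus discounted value of the successor state
    # (no bootstrap term when the successor is terminal).
    target = reward if terminal else reward + gamma * values[next_state]
    td_error = target - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical chain 0 -> 1 -> 2 (terminal) with reward 1 on reaching state 2.
values = [0.0, 0.0, 0.0]
for _ in range(200):
    td0_update(values, 0, 0.0, 1)
    td0_update(values, 1, 1.0, 2, terminal=True)
```

On this toy chain the estimates approach 1 for state 1 and `gamma` times that for state 0, since the only reward is received on the final transition.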

Extraction of fuzzy rules using deterministic annealing integrated with ε-insensitive learning

Robert Czabański (2006)

International Journal of Applied Mathematics and Computer Science

A new method of parameter estimation for an artificial neural network inference system based on a logical interpretation of fuzzy if-then rules (ANBLIR) is presented. The novelty of the learning algorithm consists in the application of a deterministic annealing method integrated with ε-insensitive learning. In order to decrease the computational burden of the learning procedure, a deterministic annealing method with a "freezing" phase and ε-insensitive learning by solving a system of linear inequalities...

Neural network-based MRAC control of dynamic nonlinear systems

Ghania Debbache, Abdelhak Bennia, Noureddine Golea (2006)

International Journal of Applied Mathematics and Computer Science

This paper presents direct model reference adaptive control for a class of nonlinear systems with unknown nonlinearities. The model following conditions are assured by using adaptive neural networks as the nonlinear state feedback controller. Both full state information and observer-based schemes are investigated. All the signals in the closed loop are guaranteed to be bounded and the system state is proven to converge to a small neighborhood of the reference model state. It is also shown that stability...

Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion

J. Minjárez-Sosa (1999)

Applicationes Mathematicae

We introduce average cost optimal adaptive policies in a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to the system equations $x_{t+1} = F(x_t, a_t, \xi_t)$, $t = 1, 2, \ldots$, with i.i.d. $\mathbb{R}^k$-valued random vectors $\xi_t$, which are observable but whose density $\varrho$ is unknown.

Nonparametric statistical analysis for multiple comparison of machine learning regression algorithms

Bogdan Trawiński, Magdalena Smętek, Zbigniew Telec, Tadeusz Lasota (2012)

International Journal of Applied Mathematics and Computer Science

In this paper we present some guidelines for the application of nonparametric statistical tests and post-hoc procedures devised to perform multiple comparisons of machine learning algorithms. We emphasize that it is necessary to distinguish between pairwise and multiple comparison tests. We show that the pairwise Wilcoxon test, when employed for multiple comparisons, will lead to overoptimistic conclusions. We carry out intensive normality examination employing ten different tests showing that the...
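
One common way to see why repeated pairwise tests can be overoptimistic is the growth of the family-wise error rate. The sketch below is an illustration, not a computation taken from the paper: it assumes the pairwise tests are independent (which comparisons on shared data generally are not), and the function names, the significance level 0.05, and the count of five algorithms are hypothetical.

```python
def familywise_error_rate(alpha, num_tests):
    """Probability of at least one false rejection among independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_alpha(alpha, num_tests):
    """Per-test level that keeps the family-wise error rate at most alpha."""
    return alpha / num_tests


num_algorithms = 5  # hypothetical number of algorithms being compared
num_pairs = num_algorithms * (num_algorithms - 1) // 2  # 10 pairwise tests
fwer = familywise_error_rate(0.05, num_pairs)
per_test_level = bonferroni_alpha(0.05, num_pairs)
```

Under these assumptions, ten uncorrected tests at level 0.05 give a family-wise error rate of roughly 0.40, while the Bonferroni-style per-test level is 0.005.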

On nearly selfoptimizing strategies for multiarmed bandit problems with controlled arms

Ewa Drabik (1996)

Applicationes Mathematicae

Two kinds of strategies for a multiarmed Markov bandit problem with controlled arms are considered: a strategy with forcing and a strategy with randomization. The choice of arm and control function in both cases is based on the current value of the average cost per unit time functional. Some simulation results are also presented.

On the discrete time-varying JLQG problem

Adam Czornik, Andrzej Świerniak (2002)

International Journal of Applied Mathematics and Computer Science

In the present paper optimal time-invariant state feedback controllers are designed for a class of discrete time-varying control systems with Markov jumping parameter and quadratic performance index. We assume that the coefficients have limits as time tends to infinity and the boundary system is absolutely observable and stabilizable. Moreover, following the same line of reasoning, an adaptive controller is proposed in the case when system parameters are unknown but their strongly consistent estimators...

Recursive self-tuning control of finite Markov chains

Vivek Borkar (1997)

Applicationes Mathematicae

A recursive self-tuning control scheme for finite Markov chains is proposed wherein the unknown parameter is estimated by a stochastic approximation scheme for maximizing the log-likelihood function and the control is obtained via a relative value iteration algorithm. The analysis uses the asymptotic o.d.e.s associated with these.
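
For orientation, here is a minimal sketch of relative value iteration for a finite average-cost Markov decision process with a known model. It omits the stochastic approximation estimator of the unknown parameter that the abstract describes, and the function name, the two-state costs, and the transition probabilities are hypothetical values chosen so the result can be checked by hand.

```python
def relative_value_iteration(costs, transitions, ref_state=0, tol=1e-10, max_iter=10_000):
    """Return (gain, h): estimated average cost and relative value function.

    costs[s][a] is the one-stage cost; transitions[s][a][j] is the
    probability of moving from state s to state j under action a.
    """
    num_states = len(costs)
    h = [0.0] * num_states
    gain = 0.0
    for _ in range(max_iter):
        # One Bellman backup per state, minimizing over actions.
        backup = [
            min(
                costs[s][a] + sum(p * h[j] for j, p in enumerate(transitions[s][a]))
                for a in range(len(costs[s]))
            )
            for s in range(num_states)
        ]
        # Subtract the value at the reference state to keep iterates bounded.
        gain = backup[ref_state]
        new_h = [value - gain for value in backup]
        converged = max(abs(x - y) for x, y in zip(new_h, h)) < tol
        h = new_h
        if converged:
            break
    return gain, h


# Hypothetical two-state, two-action model.
costs = [[1.0, 2.0], [4.0, 5.0]]
transitions = [
    [[0.9, 0.1], [0.5, 0.5]],
    [[0.5, 0.5], [0.9, 0.1]],
]
gain, h = relative_value_iteration(costs, transitions)
```

For this toy model the iteration settles at an average cost of 1.4 with relative values 0 and 4, which satisfy the average-cost optimality equation in both states.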
