We study the adaptive control problem for discrete-time Markov control processes with Borel state and action spaces and possibly unbounded one-stage costs. The processes are given by recurrent equations driven by i.i.d. random vectors whose density is unknown. Assuming that the random vectors are observable, we propose a statistical estimation procedure for the unknown density that allows us to prove discounted asymptotic optimality of two types of adaptive policies used earlier for processes with bounded costs.
Kernel Principal Component Analysis (KPCA), an example of machine learning, can be considered a non-linear extension of the PCA method. While various applications of KPCA are known, this paper explores the possibility of using it to build a data-driven model of a non-linear system: the water distribution system of the town of Chojnice (Poland). This model is utilised for fault detection, with the emphasis on water leakage detection. A systematic description of the system's framework is followed by...
This paper is concerned with the sampled-data based adaptive linear quadratic (LQ) control of hybrid systems with both unmeasurable Markov jump processes and stochastic noises. Parameter estimates are obtained by the least matching error estimation algorithm. A sampled-data based adaptive LQ control is designed via a double-step (DS) sampling approach and the certainty equivalence principle. The DS approach is characterized by a comparatively large estimation step for parameter estimation and...
Supervised learning methods are powerful techniques for learning a function from a given set of labeled data, the so-called training data. In this paper the support vector machine approach is applied to an image classification task. Starting with the corresponding Tikhonov regularization problem, reformulated as a convex optimization problem, we introduce a conjugate dual problem to it and prove that, whenever strong duality holds, the function to be learned can be expressed via the dual optimal solutions....
In this article, a new class of epoch-incremental reinforcement learning algorithms is proposed. In the incremental mode, the fundamental TD(0) or TD(λ) algorithm is performed and an environment model is created. In the epoch mode, the distances of past-active states to the terminal state are computed on the basis of the environment model. These distances and the terminal-state reinforcement signal are used to improve the agent's policy.
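The two modes described above can be sketched in a few lines. The corridor environment, variable names, and the BFS-based distance computation below are illustrative assumptions, not the paper's exact formulation; they only show how an incremental TD(0) backup and an epoch-mode distance pass over a learned model fit together.

```python
# Sketch of the epoch-incremental idea on a toy corridor MDP (states 0..4,
# terminal state 4). The environment and names are illustrative assumptions.
from collections import deque

N_STATES, TERMINAL = 5, 4
ALPHA, GAMMA = 0.5, 0.9

def td0_update(V, s, r, s_next):
    """Incremental mode: one tabular TD(0) backup."""
    target = r + GAMMA * V[s_next]
    V[s] += ALPHA * (target - V[s])

def epoch_distances(transitions, terminal):
    """Epoch mode: BFS over the learned model gives each visited state's
    distance (in steps) to the terminal state."""
    preds = {}                       # invert the model: predecessor lists
    for s, s_next in transitions:
        preds.setdefault(s_next, set()).add(s)
    dist = {terminal: 0}
    queue = deque([terminal])
    while queue:
        u = queue.popleft()
        for p in preds.get(u, ()):
            if p not in dist:
                dist[p] = dist[u] + 1
                queue.append(p)
    return dist

# One pass along the corridor: reward 1 only on reaching the terminal state.
V = [0.0] * N_STATES
model = []
for s in range(N_STATES - 1):
    r = 1.0 if s + 1 == TERMINAL else 0.0
    td0_update(V, s, r, s + 1)
    model.append((s, s + 1))

dist = epoch_distances(model, TERMINAL)
print(dist)    # {4: 0, 3: 1, 2: 2, 1: 3, 0: 4}
print(V[3])    # 0.5: the state next to the goal already carries value
```

The epoch-mode distances could then be used, as the abstract suggests, to propagate the terminal reward to past-active states far from the goal, which plain one-step TD(0) reaches only slowly.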
A new method of parameter estimation for an artificial neural network inference system based on a logical interpretation of fuzzy if-then rules (ANBLIR) is presented. The novelty of the learning algorithm lies in the application of a deterministic annealing method integrated with ε-insensitive learning. To decrease the computational burden of the learning procedure, a deterministic annealing method with a "freezing" phase and ε-insensitive learning by solving a system of linear inequalities...
This paper presents direct model reference adaptive control for a class of nonlinear systems with unknown nonlinearities. The model following conditions are assured by using adaptive neural networks as the nonlinear state feedback controller. Both full state information and observer-based schemes are investigated. All the signals in the closed loop are guaranteed to be bounded and the system state is proven to converge to a small neighborhood of the reference model state. It is also shown that stability...
We introduce average cost optimal adaptive policies for a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to recursive system equations for t=1,2,..., driven by i.i.d. random vectors which are observable but whose density ϱ is unknown.
In the paper we present some guidelines for the application of nonparametric statistical tests and post-hoc procedures devised to perform multiple comparisons of machine learning algorithms. We emphasize that it is necessary to distinguish between pairwise and multiple comparison tests. We show that the pairwise Wilcoxon test, when applied to multiple comparisons, leads to overoptimistic conclusions. We carry out an intensive normality examination employing ten different tests, showing that the...
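The warning about uncorrected pairwise tests comes down to family-wise error inflation, which a line of arithmetic makes concrete. The algorithm count below is an illustrative assumption; the independence simplification is stated in the comment.

```python
# Why uncorrected pairwise tests become overoptimistic: with k independent
# comparisons each at level alpha, the chance of at least one false
# rejection is 1 - (1 - alpha)**k.  (Independence is a simplification.)
from math import comb

alpha = 0.05
n_algorithms = 10                   # illustrative number of algorithms
k = comb(n_algorithms, 2)           # 45 pairwise comparisons

fwer = 1 - (1 - alpha) ** k
print(f"{k} comparisons, family-wise error rate ≈ {fwer:.3f}")  # ≈ 0.901

# Bonferroni: test each pair at alpha / k to restore the family-wise level.
print(f"per-comparison level: {alpha / k:.5f}")                 # ≈ 0.00111
```

With ten algorithms, a researcher running all 45 pairwise Wilcoxon tests at the nominal 5% level has roughly a 90% chance of declaring at least one spurious difference, which is exactly the overoptimism the paper cautions against; multiple-comparison procedures control this family-wise rate.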
Two kinds of strategies for a multiarmed Markov bandit problem with controlled arms are considered: a strategy with forcing and a strategy with randomization. The choice of arm and control function in both cases is based on the current value of the average cost per unit time functional. Some simulation results are also presented.
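The "strategy with forcing" can be illustrated on a simplified two-armed problem: at sparse, predetermined forcing instants each arm is sampled regardless of its estimate (keeping all estimates consistent), while at all other times the arm with the lower empirical average cost per step is played. The cost model, the square-number forcing schedule, and all names below are assumptions for the demo, not the paper's construction.

```python
# Illustrative sketch of forced exploration for a two-armed cost
# minimization problem; the Gaussian cost model is an assumption.
import random
random.seed(0)

TRUE_MEAN_COST = [1.0, 0.6]          # arm 1 is actually cheaper

def forcing_times(horizon):
    """Sparse forced-exploration instants, here at the square numbers."""
    return {k * k for k in range(1, int(horizon ** 0.5) + 1)}

def run(horizon=10_000):
    totals, counts = [0.0, 0.0], [0, 0]
    forced = forcing_times(horizon)
    for t in range(1, horizon + 1):
        if t in forced:
            arm = t % 2              # forcing: alternate between the arms
        else:                        # otherwise: lower empirical avg cost
            avg = [totals[i] / counts[i] if counts[i] else float("inf")
                   for i in range(2)]
            arm = min(range(2), key=lambda i: avg[i])
        totals[arm] += random.gauss(TRUE_MEAN_COST[arm], 0.1)
        counts[arm] += 1
    return counts

counts = run()
print(counts)   # almost all pulls go to the cheaper arm 1
```

Because the forcing instants grow sparse, their contribution to the average cost per unit time vanishes, while the occasional forced samples keep the estimate of the inferior arm from going stale; a randomized strategy replaces the deterministic schedule with small, decaying exploration probabilities.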
In the present paper, optimal time-invariant state feedback controllers are designed for a class of discrete time-varying control systems with a Markov jump parameter and a quadratic performance index. We assume that the coefficients have limits as time tends to infinity and that the boundary system is absolutely observable and stabilizable. Moreover, following the same line of reasoning, an adaptive controller is proposed for the case when system parameters are unknown but their strongly consistent estimators...
A recursive self-tuning control scheme for finite Markov chains is proposed, wherein the unknown parameter is estimated by a stochastic approximation scheme that maximizes the log-likelihood function, and the control is obtained via a relative value iteration algorithm. The analysis uses the asymptotic o.d.e.s associated with these schemes.
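The relative value iteration (RVI) half of that scheme is compact enough to sketch in isolation. The 2-state, 2-action chain below is an illustrative assumption, and the on-line parameter estimation that the paper couples with RVI is omitted; the sketch only shows the fixed-point iteration for the average-cost Bellman equation.

```python
# Minimal relative value iteration for an average-cost finite MDP,
# in pure Python.  The chain and its costs are illustrative assumptions.

# P[a][s] = transition distribution over next states, C[a][s] = one-step cost
P = {0: [[0.9, 0.1], [0.2, 0.8]],
     1: [[0.5, 0.5], [0.7, 0.3]]}
C = {0: [1.0, 2.0],
     1: [0.5, 3.0]}
STATES, ACTIONS, REF = (0, 1), (0, 1), 0   # REF: reference state for RVI

def rvi(n_iter=500):
    h = [0.0, 0.0]                  # relative value function
    for _ in range(n_iter):
        Th = [min(C[a][s] + sum(P[a][s][s2] * h[s2] for s2 in STATES)
                  for a in ACTIONS)
              for s in STATES]
        gain = Th[REF]              # subtract the value at the reference state
        h = [Th[s] - gain for s in STATES]
    return gain, h                  # gain ~ optimal average cost per step

gain, h = rvi()
print(round(gain, 3), [round(x, 3) for x in h])
```

Subtracting the reference-state value at every sweep keeps the iterates bounded, and the subtracted quantity converges to the optimal average cost; in the self-tuning scheme this iteration would run on the current parameter estimate rather than on known transition probabilities.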