Average cost Markov control processes with weighted norms: existence of canonical policies

Evgueni Gordienko, Onésimo Hernández-Lerma — 1995

Applicationes Mathematicae

This paper considers discrete-time Markov control processes on Borel spaces, with possibly unbounded costs, and the long-run average cost (AC) criterion. Under appropriate hypotheses on weighted norms for the cost function and the transition law, the existence of solutions to the average cost optimality inequality and to the average cost optimality equation is shown, which in turn yields the existence of AC-optimal and AC-canonical policies, respectively.

Average cost Markov control processes with weighted norms: value iteration

Evgueni Gordienko, Onésimo Hernández-Lerma — 1995

Applicationes Mathematicae

This paper shows the convergence of the value iteration (or successive approximations) algorithm for average cost (AC) Markov control processes on Borel spaces, with possibly unbounded cost, under appropriate hypotheses on weighted norms for the cost function and the transition law. It is also shown that the aforementioned convergence implies strong forms of AC-optimality and the existence of forecast horizons.
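As a rough illustration of what value iteration for the average cost criterion looks like, the following is a minimal sketch of relative value iteration on a small finite model. It is not the paper's setting: the paper works on Borel spaces with possibly unbounded costs, whereas this sketch assumes finitely many states and actions and strictly positive transition probabilities. The function name `relative_value_iteration`, the reference state 0, and the two-state example data are all hypothetical.

```python
def relative_value_iteration(P, c, tol=1e-10, max_iter=10_000):
    """Relative value iteration for a finite average-cost model (illustrative only).

    P[a][s][j] is the probability of moving from state s to state j under action a.
    c[s][a] is the one-stage cost of action a in state s.
    Returns (gain, h, policy): the approximate optimal average cost, a relative
    value function normalized so that h[0] == 0, and a greedy stationary policy.
    """
    n_states = len(c)
    n_actions = len(c[0])

    def q_values(h):
        # q[s][a] = c(s, a) + sum_j P(j | s, a) * h(j)
        return [
            [c[s][a] + sum(P[a][s][j] * h[j] for j in range(n_states))
             for a in range(n_actions)]
            for s in range(n_states)
        ]

    h = [0.0] * n_states
    gain = 0.0
    for _ in range(max_iter):
        th = [min(row) for row in q_values(h)]
        gain = th[0]                      # value at the reference state 0
        h_new = [v - gain for v in th]    # subtract it to keep iterates bounded
        converged = max(abs(x - y) for x, y in zip(h_new, h)) < tol
        h = h_new
        if converged:
            break

    q = q_values(h)
    policy = [min(range(n_actions), key=lambda a: q[s][a]) for s in range(n_states)]
    return gain, h, policy


# Hypothetical two-state, two-action example (all data are assumptions).
P = [
    [[0.9, 0.1], [0.5, 0.5]],   # transitions under action 0
    [[0.2, 0.8], [0.1, 0.9]],   # transitions under action 1
]
c = [[1.0, 3.0], [2.0, 0.5]]    # c[s][a]
gain, h, policy = relative_value_iteration(P, c)
```

In this toy model the returned `gain` and `h` approximately satisfy the average cost optimality equation `gain + h[s] = min_a (c[s][a] + sum_j P[a][s][j] * h[j])`, which is the finite-state analogue of the equation the abstracts above refer to.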

Comparing the distributions of sums of independent random vectors

Evgueni I. Gordienko — 2005

Kybernetika

Let $(X_n, n \ge 1)$, $(\tilde{X}_n, n \ge 1)$ be two sequences of i.i.d. random vectors with values in $\mathbb{R}^k$ and $S_n = X_1 + \dots + X_n$, $\tilde{S}_n = \tilde{X}_1 + \dots + \tilde{X}_n$, $n \ge 1$. Assuming that $E X_1 = E \tilde{X}_1$, $E|X_1|^2 < \infty$, $E|\tilde{X}_1|^{k+2} < \infty$ and the existence of a density of $\tilde{X}_1$ satisfying certain conditions, we prove the following inequalities: $v(S_n, \tilde{S}_n) \le c \max\{v(X_1, \tilde{X}_1), \zeta_2(X_1, \tilde{X}_1)\}$, $n = 1, 2, \dots$, where $v$ and $\zeta_2$ are the total variation and Zolotarev's metrics, respectively.

Estimates of stability of Markov control processes with unbounded costs

For a discrete-time Markov control process with the transition probability $p$, we compare the total discounted costs $V_\beta(\pi_\beta)$ and $V_\beta(\tilde{\pi}_\beta)$, when applying the optimal control policy $\pi_\beta$ and its approximation $\tilde{\pi}_\beta$. The policy $\tilde{\pi}_\beta$ is optimal for an approximating process with the transition probability $\tilde{p}$. The cost per stage for the processes considered can be unbounded. Under certain ergodicity assumptions we establish an upper bound for the relative stability index $[V_\beta(\tilde{\pi}_\beta) - V_\beta(\pi_\beta)] / V_\beta(\pi_\beta)$. This bound does not depend on a discount...

A note on the convergence rate in regularized stochastic programming

Evgueni I. Gordienko, Yury Gryazin — 2021

Kybernetika

We deal with a stochastic programming problem that can be inconsistent. To overcome the inconsistency we apply Tikhonov's regularization technique, and, using recent results on the convergence rate of empirical measures in Wasserstein metric, we treat the following two related problems: 1. A choice of regularization parameters that guarantees the convergence of the minimization procedure. 2. Estimation of the rate of convergence in probability. Considering both light and heavy tail distributions...

Adaptive control for discrete-time Markov processes with unbounded costs: Discounted criterion

We study the adaptive control problem for discrete-time Markov control processes with Borel state and action spaces and possibly unbounded one-stage costs. The processes are given by recurrent equations $x_{t+1} = F(x_t, a_t, \xi_t)$, $t = 0, 1, \dots$, with i.i.d. $\mathbb{R}^k$-valued random vectors $\xi_t$ whose density $\rho$ is unknown. Assuming observability of $\xi_t$, we propose a procedure of statistical estimation of $\rho$ that allows us to prove discounted asymptotic optimality of two types of adaptive policies used earlier for processes with bounded costs.

Asymptotic properties and optimization of some non-Markovian stochastic processes

We study the limit behavior of certain classes of dependent random sequences (processes) which do not possess the Markov property. Assuming these processes depend on a control parameter we show that the optimization of the control can be reduced to a problem of nonlinear optimization. Under certain hypotheses we establish the stability of such optimization problems.

Stability estimating in optimal sequential hypotheses testing

We study the stability of the classical optimal sequential probability ratio test based on independent identically distributed observations $X_1, X_2, \dots$ when testing two simple hypotheses about their common density $f$: $f = f_0$ versus $f = f_1$. As the functional to be minimized, we use a weighted sum of the average (under $f_0$) sample number and the probabilities of the two types of error. We prove that the problem is reduced to stopping time optimization for a ratio process generated by $X_1, X_2, \dots$ with the density $f_0$. For $\tau^*$ being the corresponding...
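For readers unfamiliar with the sequential probability ratio test mentioned in this abstract, here is a minimal sketch of its classical form. It is only an illustration under stated assumptions: the stopping thresholds use Wald's approximations for target error probabilities `alpha` and `beta`, which is not the weighted-sum functional the abstract describes, and the function name `sprt`, the Gaussian example densities, and the sample data are hypothetical.

```python
import math


def sprt(observations, log_likelihood_ratio, alpha=0.05, beta=0.05):
    """Classical SPRT with Wald's approximate thresholds (illustrative only).

    log_likelihood_ratio(x) should return log(f1(x) / f0(x)).
    Returns (decision, n): decision is "H1", "H0", or None if the data ran out
    before a boundary was crossed; n is the number of observations used.
    """
    upper = math.log((1.0 - beta) / alpha)   # cross above: decide f = f1
    lower = math.log(beta / (1.0 - alpha))   # cross below: decide f = f0
    total = 0.0
    n = 0
    for x in observations:
        n += 1
        total += log_likelihood_ratio(x)     # running log-likelihood ratio
        if total >= upper:
            return "H1", n
        if total <= lower:
            return "H0", n
    return None, n


# Hypothetical example: f0 = N(0, 1) versus f1 = N(1, 1),
# for which log(f1(x) / f0(x)) = x - 0.5.
def gaussian_llr(x):
    return x - 0.5


decision_high, n_high = sprt([1.5, 1.5, 1.5, 1.5], gaussian_llr)
decision_low, n_low = sprt([-0.5, -0.5, -0.5, -0.5], gaussian_llr)
decision_open, n_open = sprt([0.5, 0.5], gaussian_llr)
```

The running sum `total` plays the role of the (log) ratio process generated by the observations; the test stops the first time it leaves the interval between the two thresholds.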
