Optimal estimators in learning theory
Banach Center Publications (2006)
- Volume: 72, Issue: 1, page 341-366
- ISSN: 0137-6934
How to cite
V. N. Temlyakov. "Optimal estimators in learning theory." Banach Center Publications 72.1 (2006): 341-366. <http://eudml.org/doc/282344>.
@article{V2006,
abstract = {This paper is a survey of recent results on some problems of supervised learning in the setting formulated by Cucker and Smale. Supervised learning, or learning-from-examples, refers to a process that builds, on the basis of available data of inputs $x_i$ and outputs $y_i$, $i = 1,\dots,m$, a function that best represents the relation between the inputs $x \in X$ and the corresponding outputs $y \in Y$. The goal is to find an estimator $f_{z}$ on the basis of given data $z := ((x_1,y_1),\dots,(x_m,y_m))$ that approximates well the regression function $f_{\rho}$ of an unknown Borel probability measure $\rho$ defined on $Z = X \times Y$. We assume that $(x_i,y_i)$, $i = 1,\dots,m$, are independent and distributed according to $\rho$. We discuss a problem of finding optimal (in the sense of order) estimators for different classes $\Theta$ (we assume $f_{\rho} \in \Theta$). It is known from the previous works that the behavior of the entropy numbers $\epsilon_n(\Theta,B)$ of $\Theta$ in a Banach space $B$ plays an important role in the above problem. The standard way of measuring the error between a target function $f_{\rho}$ and an estimator $f_{z}$ is to use the $L_2(\rho_X)$ norm ($\rho_X$ is the marginal probability measure on $X$ generated by $\rho$). The usual way in regression theory to evaluate the performance of the estimator $f_{z}$ is by studying its convergence in expectation, i.e. the rate of decay of the quantity $E(\|f_{\rho} - f_{z}\|^2_{L_2(\rho_X)})$ as the sample size $m$ increases. Here the expectation is taken with respect to the product measure $\rho^m$ defined on $Z^m$. A more accurate and more delicate way of evaluating the performance of $f_{z}$ has been pushed forward in [CS]. In [CS] the authors study the probability distribution function
$\rho^m\{z: \|f_{\rho} - f_{z}\|_{L_2(\rho_X)} \ge \eta\}$
instead of the expectation $E(\|f_{\rho} - f_{z}\|^2_{L_2(\rho_X)})$. In this survey we mainly discuss the optimization problem formulated in terms of the probability distribution function.},
author = {V. N. Temlyakov},
journal = {Banach Center Publications},
language = {eng},
number = {1},
pages = {341-366},
title = {Optimal estimators in learning theory},
url = {http://eudml.org/doc/282344},
volume = {72},
year = {2006},
}
TY - JOUR
AU - V. N. Temlyakov
TI - Optimal estimators in learning theory
JO - Banach Center Publications
PY - 2006
VL - 72
IS - 1
SP - 341
EP - 366
AB - This paper is a survey of recent results on some problems of supervised learning in the setting formulated by Cucker and Smale. Supervised learning, or learning-from-examples, refers to a process that builds, on the basis of available data of inputs $x_i$ and outputs $y_i$, $i = 1,\dots,m$, a function that best represents the relation between the inputs $x \in X$ and the corresponding outputs $y \in Y$. The goal is to find an estimator $f_{z}$ on the basis of given data $z := ((x_1,y_1),\dots,(x_m,y_m))$ that approximates well the regression function $f_{\rho}$ of an unknown Borel probability measure $\rho$ defined on $Z = X \times Y$. We assume that $(x_i,y_i)$, $i = 1,\dots,m$, are independent and distributed according to $\rho$. We discuss a problem of finding optimal (in the sense of order) estimators for different classes $\Theta$ (we assume $f_{\rho} \in \Theta$). It is known from the previous works that the behavior of the entropy numbers $\epsilon_n(\Theta,B)$ of $\Theta$ in a Banach space $B$ plays an important role in the above problem. The standard way of measuring the error between a target function $f_{\rho}$ and an estimator $f_{z}$ is to use the $L_2(\rho_X)$ norm ($\rho_X$ is the marginal probability measure on $X$ generated by $\rho$). The usual way in regression theory to evaluate the performance of the estimator $f_{z}$ is by studying its convergence in expectation, i.e. the rate of decay of the quantity $E(\|f_{\rho} - f_{z}\|^2_{L_2(\rho_X)})$ as the sample size $m$ increases. Here the expectation is taken with respect to the product measure $\rho^m$ defined on $Z^m$. A more accurate and more delicate way of evaluating the performance of $f_{z}$ has been pushed forward in [CS]. In [CS] the authors study the probability distribution function
$\rho^m\{z: \|f_{\rho} - f_{z}\|_{L_2(\rho_X)} \ge \eta\}$
instead of the expectation $E(\|f_{\rho} - f_{z}\|^2_{L_2(\rho_X)})$. In this survey we mainly discuss the optimization problem formulated in terms of the probability distribution function.
LA - eng
UR - http://eudml.org/doc/282344
ER -