IFS approximations of distribution functions and related optimization problems.
The purpose of feature selection in machine learning is at least two-fold: saving measurement acquisition costs and reducing the negative effects of the curse of dimensionality, with the aim of improving the accuracy of models and the classification rate of classifiers on previously unseen data. Yet it has recently been shown that the process of feature selection itself can be negatively affected by the very same curse of dimensionality: feature selection methods may easily over-fit...
For a binary stationary time series, define T to be the number of consecutive ones up to the first zero encountered after time n, and consider the problem of estimating the conditional distribution and conditional expectation of T after one has observed the first n outputs. We present a sequence of stopping times and universal estimators for these quantities which are pointwise consistent for all ergodic binary stationary processes. When the process is a renewal process with zero as the renewal state...
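In the renewal case mentioned above, the blocks of ones between consecutive zeros are i.i.d., so the conditional distribution of the next run of ones can be estimated by the empirical distribution of past completed runs. The following is a plug-in sketch of that special case only (the function name is illustrative; it is not the paper's universal scheme with stopping times):

```python
def run_length_distribution(bits):
    """Empirical distribution of the lengths of runs of ones that are
    delimited by zeros.  Only completed runs (terminated by a zero)
    are counted; a trailing, possibly censored run is ignored."""
    runs = []
    count = 0
    seen_zero = False
    for b in bits:
        if b == 1:
            if seen_zero:
                count += 1
        else:
            if seen_zero:
                runs.append(count)
            count = 0
            seen_zero = True
    if not runs:
        return []
    m = max(runs)
    return [runs.count(k) / len(runs) for k in range(m + 1)]
```

For example, the sequence 0,1,1,0,1,0,0 contains the completed runs of lengths 2, 1 and 0, so each length receives empirical probability 1/3; the estimated conditional expectation is simply the mean of the observed run lengths.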
We give estimation schemes for the conditional distribution and conditional expectation of the next output following the observation of the first n outputs of a stationary process whose random variables take finitely many possible values. Our schemes are universal in the class of finitarily Markovian processes that have an exponential rate for the tail of the look-back time distribution. In addition, explicit rates are given. A necessary restriction is that the scheme proposes an...
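The look-back idea can be illustrated with a toy context-matching estimator: find the longest suffix of the observed sequence that re-occurs earlier, and use the empirical distribution of the symbols following those earlier occurrences. This is a simplification for illustration only; the paper's scheme additionally relies on carefully chosen stopping times and gives explicit rates:

```python
from collections import Counter

def longest_match_predictor(seq):
    """Estimate the conditional distribution of the next symbol from
    the longest suffix of seq that has occurred earlier, using the
    empirical distribution of the symbols that followed it."""
    n = len(seq)
    for k in range(n - 1, 0, -1):  # try suffix lengths, long to short
        suffix = seq[n - k:]
        # earlier (possibly overlapping) occurrences of the suffix,
        # each contributing the symbol that immediately followed it
        followers = [seq[i + k] for i in range(n - k)
                     if seq[i:i + k] == suffix]
        if followers:
            counts = Counter(followers)
            total = sum(counts.values())
            return {s: c / total for s, c in counts.items()}
    # no repeated suffix at all: fall back to the marginal distribution
    counts = Counter(seq)
    return {s: c / n for s, c in counts.items()}
```

On the alternating sequence 0,1,0,1,0,1,0,1 the longest repeated suffix is always followed by a zero, so the estimator returns probability one for the symbol 0.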
In the context of high-frequency data, one often has to deal with observations occurring at irregularly spaced times, for example at transaction times in finance. Here we examine how the estimation of the squared or other powers of the volatility is affected by irregularly spaced data. The emphasis is on the kind of assumptions on the sampling scheme which allow one to obtain consistent estimators, together with an associated central limit theorem, especially when the sampling scheme depends on...
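The simplest estimator of this kind is the realized variance, the sum of squared increments between consecutive observation times. A minimal simulation sketch, assuming constant volatility and a hypothetical uniform random sampling scheme (neither is from the paper), shows that irregular spacing does not by itself spoil consistency as long as the mesh of the sampling times shrinks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical irregular sampling scheme: uniform random times on [0, 1].
n = 20000
times = np.sort(rng.uniform(0.0, 1.0, size=n))
times = np.concatenate(([0.0], times, [1.0]))

# Simulate X_t = sigma * W_t at the sampling times; with constant
# volatility, the integrated squared volatility on [0, 1] is sigma**2.
sigma = 0.5
dt = np.diff(times)
increments = sigma * np.sqrt(dt) * rng.standard_normal(dt.shape)
x = np.concatenate(([0.0], np.cumsum(increments)))

# Realized variance: sum of squared increments between consecutive
# observation times, regardless of how irregular the spacing is.
realized_var = np.sum(np.diff(x) ** 2)
print(realized_var)  # close to sigma**2 = 0.25
```

The delicate questions treated in the abstract arise when the sampling times are themselves random and possibly dependent on the process, which this sketch does not model.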
This paper presents a new algorithm for regression estimation, in both the inductive and the transductive setting. The estimator is defined as a linear combination of functions in a given dictionary. The coefficients of the combination are computed sequentially by projecting onto simple sets, defined as confidence regions provided by a deviation (PAC) inequality on an estimator in one-dimensional models. We prove that every projection step the algorithm performs actually improves the performance...
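The flavour of such sequential one-dimensional updates can be sketched with coordinate descent on a dictionary: cycle through the columns and, for each, solve the one-dimensional least-squares problem on the current residual. This toy code only mimics the sequential structure; the paper's projections are onto PAC confidence regions, which are not reproduced here:

```python
import numpy as np

def sequential_projection_fit(X, y, n_passes=200):
    """Fit y ~ X @ theta by cycling through the dictionary columns and
    updating one coefficient at a time via the exact one-dimensional
    least-squares step on the current residual (coordinate descent)."""
    n, d = X.shape
    theta = np.zeros(d)
    residual = y.astype(float).copy()
    for _ in range(n_passes):
        for j in range(d):
            col = X[:, j]
            step = col @ residual / (col @ col)  # 1-D least squares
            theta[j] += step
            residual -= step * col
    return theta
```

Each one-dimensional step can only decrease the squared residual norm, which is the coordinate-descent analogue of the abstract's claim that every projection improves the estimator; on a well-conditioned dictionary the iterates converge to the ordinary least-squares solution.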