Data compression in discriminating stochastic processes
In the companion paper [C. Maugis and B. Michel, A non asymptotic penalized criterion for Gaussian mixture model selection. ESAIM: P&S 15 (2011) 41–68], a penalized likelihood criterion is proposed to select a Gaussian mixture model among a specific model collection. This criterion depends on unknown constants which have to be calibrated in practical situations. A “slope heuristics” method is described and tested experimentally to deal with this practical problem. In a model-based clustering context, the...
We establish consistent estimators of the jump positions and jump altitudes of a multi-level step function that is the best -approximation of a probability density function f. If f itself is a step function, the number of jumps may be unknown.
We propose a feature selection method for density estimation with quadratic loss. This method relies on the study of unidimensional approximation models and on the definition of confidence regions for the density based on these models. It is quite general and includes cases of interest such as the detection of relevant wavelet coefficients or the selection of support vectors in SVM. In the general case, we prove that every selected feature actually improves the performance of the estimator. In the case...
In this paper we consider a smoothness parameter estimation problem for a density function. The smoothness parameter of a function is defined in terms of Besov spaces. This paper is an extension of recent results (K. Dziedziul, M. Kucharska, B. Wolnik, Estimation of the smoothness parameter). The construction of the estimator is based on wavelet coefficients. Although we believe that effective estimation of the smoothness parameter is impossible in the general case, we can show that it becomes...
In this paper, a very useful lemma (in two versions) is proved: it notably simplifies the essential step in establishing a Lindeberg central limit theorem for dependent processes. Then, applying this lemma to the weakly dependent processes introduced in Doukhan and Louhichi (1999), a new central limit theorem is obtained for the sample mean or a kernel density estimator. Moreover, by using subsampling, extensions of these central limit theorems under weaker assumptions are provided. All the usual causal...
This paper is devoted to the study of some asymptotic properties of an M-estimator in the framework of detecting abrupt changes in the distribution of a random field. This class of problems includes, e.g., the recovery of sets. It involves various techniques, including the M-estimation method, concentration inequalities, maximal inequalities for dependent random variables and ϕ-mixing. The criterion function is penalized when the size of the true model is unknown. All the results apply under mild,...
To validate pollution data, subject-matter experts in Airpl (an organization that maintains a network of air pollution monitoring stations in western France) perform daily visual examinations of the data and check their consistency. In this paper, we describe these visual examinations and propose a formalization for this problem. The examinations consist of comparisons of so-called shorth intervals, so we build a statistical test that compares such intervals in a nonparametric regression model. This...
The purpose of this paper is to investigate the deviation inequalities and the moderate deviation principle of the least squares estimators of the unknown parameters of general pth-order asymmetric bifurcating autoregressive processes, under suitable assumptions on the driving noise of the process. Our investigation relies on the moderate deviation principle for martingales.
For a doubly truncated exponential distribution, the probability density function of a quasi-range is derived. From this, the density of the sample range is obtained as a special case. Expressions for the mean and variance of the range are also obtained.