Challenging the empirical mean and empirical variance: A deviation study
Annales de l'I.H.P. Probabilités et statistiques (2012)
- Volume: 48, Issue: 4, pages 1148-1185
- ISSN: 0246-0203
How to cite
Catoni, Olivier. "Challenging the empirical mean and empirical variance: A deviation study." Annales de l'I.H.P. Probabilités et statistiques 48.4 (2012): 1148-1185. <http://eudml.org/doc/272006>.
@article{Catoni2012,
abstract = {We present new M-estimators of the mean and variance of real valued random variables, based on PAC-Bayes bounds. We analyze the non-asymptotic minimax properties of the deviations of those estimators for sample distributions having either a bounded variance or a bounded variance and a bounded kurtosis. Under those weak hypotheses, allowing for heavy-tailed distributions, we show that the worst case deviations of the empirical mean are suboptimal. We prove indeed that for any confidence level, there is some M-estimator whose deviations are of the same order as the deviations of the empirical mean of a Gaussian statistical sample, even when the statistical sample is instead heavy-tailed. Experiments reveal that these new estimators perform even better than predicted by our bounds, showing deviation quantile functions uniformly lower at all probability levels than the empirical mean for non-Gaussian sample distributions as simple as the mixture of two Gaussian measures.},
author = {Catoni, Olivier},
journal = {Annales de l'I.H.P. Probabilités et statistiques},
keywords = {non-parametric estimation; M-estimators; PAC-Bayes bounds; nonparametric estimation},
language = {eng},
number = {4},
pages = {1148--1185},
publisher = {Gauthier-Villars},
title = {Challenging the empirical mean and empirical variance: A deviation study},
url = {http://eudml.org/doc/272006},
volume = {48},
year = {2012},
}
TY - JOUR
AU - Catoni, Olivier
TI - Challenging the empirical mean and empirical variance: A deviation study
JO - Annales de l'I.H.P. Probabilités et statistiques
PY - 2012
PB - Gauthier-Villars
VL - 48
IS - 4
SP - 1148
EP - 1185
AB - We present new M-estimators of the mean and variance of real valued random variables, based on PAC-Bayes bounds. We analyze the non-asymptotic minimax properties of the deviations of those estimators for sample distributions having either a bounded variance or a bounded variance and a bounded kurtosis. Under those weak hypotheses, allowing for heavy-tailed distributions, we show that the worst case deviations of the empirical mean are suboptimal. We prove indeed that for any confidence level, there is some M-estimator whose deviations are of the same order as the deviations of the empirical mean of a Gaussian statistical sample, even when the statistical sample is instead heavy-tailed. Experiments reveal that these new estimators perform even better than predicted by our bounds, showing deviation quantile functions uniformly lower at all probability levels than the empirical mean for non-Gaussian sample distributions as simple as the mixture of two Gaussian measures.
LA - eng
KW - non-parametric estimation; M-estimators; PAC-Bayes bounds; nonparametric estimation
UR - http://eudml.org/doc/272006
ER -
References
- [1] P. Alquier. PAC-Bayesian bounds for randomized empirical risk minimizers. Math. Methods Statist. 17 (2008) 279–304. Zbl 1260.62038, MR 2483458
- [2] J.-Y. Audibert. A better variance control for PAC-Bayesian classification. Preprint n.905bis, Laboratoire de Probabilités et Modèles Aléatoires, Universités Paris 6 and Paris 7, 2004. Available at http://www.proba.jussieu.fr/mathdoc/textes/PMA-905Bis.pdf.
- [3] J.-Y. Audibert and O. Catoni. Robust linear least squares regression. Ann. Statist. 39 (2011) 2766–2794. Zbl 1231.62126, MR 2906886
- [4] J.-Y. Audibert and O. Catoni. Robust linear regression through PAC-Bayesian truncation. Unpublished manuscript, 2010. Available at http://hal.inria.fr/hal-00522536.
- [5] R. Beran. An efficient and robust adaptive estimator of location. Ann. Statist. 6 (1978) 292–313. Zbl 0378.62051, MR 518885
- [6] P. J. Bickel. On adaptive estimation. Ann. Statist. 10 (1982) 647–671. Zbl 0489.62033, MR 663424
- [7] O. Catoni. Statistical Learning Theory and Stochastic Optimization: École d’Été de Probabilités de Saint-Flour XXXI – 2001. Lecture Notes in Math. 1851. Springer, Berlin, 2004. Zbl 1076.93002, MR 2163920
- [8] O. Catoni. PAC-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning. IMS Lecture Notes Monogr. Ser. 56. Institute of Mathematical Statistics, Beachwood, OH, 2007. Zbl 1277.62015, MR 2483528
- [9] P. J. Huber. Robust estimation of a location parameter. Ann. Math. Statist. 35 (1964) 73–101. Zbl 0136.39805, MR 161415
- [10] P. J. Huber. Robust Statistics. Wiley Series in Probability and Mathematical Statistics. Wiley-Interscience, New York, 1981. Zbl 1276.62022, MR 606374
- [11] O. Lepski. Asymptotically minimax adaptive estimation I: Upper bounds. Optimally adaptive estimates. Theory Probab. Appl. 36 (1991) 682–697. Zbl 0776.62039, MR 1147167
- [12] D. A. McAllester. PAC-Bayesian model averaging. In Proceedings of the 12th Annual Conference on Computational Learning Theory. Morgan Kaufmann, New York, 1999. Zbl 0945.68157, MR 1811612
- [13] D. A. McAllester. Some PAC-Bayesian theorems. Mach. Learn. 37 (1999) 355–363. Zbl 0945.68157, MR 1811587
- [14] D. A. McAllester. PAC-Bayesian stochastic model selection. Mach. Learn. 51 (2003) 5–21. Zbl 1056.68122
- [15] C. J. Stone. Adaptive maximum likelihood estimators of a location parameter. Ann. Statist. 3 (1975) 267–284. Zbl 0303.62026, MR 362669