Optimal model selection in density estimation

Matthieu Lerasle

Annales de l'I.H.P. Probabilités et statistiques (2012)

  • Volume: 48, Issue: 3, pages 884-908
  • ISSN: 0246-0203

Abstract

In order to calibrate a penalization procedure for model selection, the statistician has to choose a shape for the penalty and a leading constant. In this paper, we study, for the marginal density estimation problem, the resampling penalties as general estimators of the shape of an ideal penalty. We prove that the selected estimator satisfies sharp oracle inequalities without remainder terms under a few assumptions on the marginal density s and the collection of models. We also study the slope heuristic, which yields a data-driven choice of the leading constant in front of the penalty when the complexity of the models is well-chosen.

How to cite


Lerasle, Matthieu. "Optimal model selection in density estimation." Annales de l'I.H.P. Probabilités et statistiques 48.3 (2012): 884-908. <http://eudml.org/doc/271968>.

@article{Lerasle2012,
abstract = {In order to calibrate a penalization procedure for model selection, the statistician has to choose a shape for the penalty and a leading constant. In this paper, we study, for the marginal density estimation problem, the resampling penalties as general estimators of the shape of an ideal penalty. We prove that the selected estimator satisfies sharp oracle inequalities without remainder terms under a few assumptions on the marginal density $s$ and the collection of models. We also study the slope heuristic, which yields a data-driven choice of the leading constant in front of the penalty when the complexity of the models is well-chosen.},
author = {Lerasle, Matthieu},
journal = {Annales de l'I.H.P. Probabilités et statistiques},
keywords = {density estimation; optimal model selection; resampling methods; slope heuristic},
language = {eng},
number = {3},
pages = {884-908},
publisher = {Gauthier-Villars},
title = {Optimal model selection in density estimation},
url = {http://eudml.org/doc/271968},
volume = {48},
year = {2012},
}

TY - JOUR
AU - Lerasle, Matthieu
TI - Optimal model selection in density estimation
JO - Annales de l'I.H.P. Probabilités et statistiques
PY - 2012
PB - Gauthier-Villars
VL - 48
IS - 3
SP - 884
EP - 908
AB - In order to calibrate a penalization procedure for model selection, the statistician has to choose a shape for the penalty and a leading constant. In this paper, we study, for the marginal density estimation problem, the resampling penalties as general estimators of the shape of an ideal penalty. We prove that the selected estimator satisfies sharp oracle inequalities without remainder terms under a few assumptions on the marginal density $s$ and the collection of models. We also study the slope heuristic, which yields a data-driven choice of the leading constant in front of the penalty when the complexity of the models is well-chosen.
LA - eng
KW - density estimation; optimal model selection; resampling methods; slope heuristic
UR - http://eudml.org/doc/271968
ER -

