Model selection via testing : an alternative to (penalized) maximum likelihood estimators
Annales de l'I.H.P. Probabilités et statistiques (2006)
- Volume: 42, Issue: 3, page 273-325
- ISSN: 0246-0203
How to cite
Birgé, Lucien. "Model selection via testing : an alternative to (penalized) maximum likelihood estimators." Annales de l'I.H.P. Probabilités et statistiques 42.3 (2006): 273-325. <http://eudml.org/doc/77897>.
@article{Birgé2006,
author = {Birgé, Lucien},
journal = {Annales de l'I.H.P. Probabilités et statistiques},
keywords = {maximum likelihood; robustness; robust tests; metric dimension; minimax risk; model selection; aggregation of estimators},
language = {eng},
number = {3},
pages = {273-325},
publisher = {Elsevier},
title = {Model selection via testing : an alternative to (penalized) maximum likelihood estimators},
url = {http://eudml.org/doc/77897},
volume = {42},
year = {2006},
}
TY - JOUR
AU - Birgé, Lucien
TI - Model selection via testing : an alternative to (penalized) maximum likelihood estimators
JO - Annales de l'I.H.P. Probabilités et statistiques
PY - 2006
PB - Elsevier
VL - 42
IS - 3
SP - 273
EP - 325
LA - eng
KW - maximum likelihood; robustness; robust tests; metric dimension; minimax risk; model selection; aggregation of estimators
UR - http://eudml.org/doc/77897
ER -
References
- [1] P. Assouad, Deux remarques sur l'estimation, C. R. Acad. Sci. Paris, Sér. I Math. 296 (1983) 1021-1024. Zbl 0568.62003, MR 777600
- [2] J.-Y. Audibert, Théorie statistique de l'apprentissage : une approche PAC-bayésienne, Thèse de doctorat, Laboratoire de Probabilités et Modèles Aléatoires, Université Paris VI, Paris, 2004.
- [3] Y. Baraud, Model selection for regression on a random design, ESAIM Probab. Statist. 6 (2002) 127-146. Zbl 1059.62038, MR 1918295
- [4] A.R. Barron, Complexity regularization with applications to artificial neural networks, in: Roussas G. (Ed.), Nonparametric Functional Estimation, Kluwer, Dordrecht, 1991, pp. 561-576. Zbl 0739.62001, MR 1154352
- [5] A.R. Barron, L. Birgé, P. Massart, Risk bounds for model selection via penalization, Probab. Theory Related Fields 113 (1999) 301-415. Zbl 0946.62036, MR 1679028
- [6] A.R. Barron, T.M. Cover, Minimum complexity density estimation, IEEE Trans. Inform. Theory 37 (1991) 1034-1054. Zbl 0743.62003, MR 1111806
- [7] J. Beirlant, L. Györfi, On the asymptotic normality of the L2-error in partitioning regression estimation, J. Statist. Plann. Inference 71 (1998) 93-107. Zbl 0961.62030, MR 1651863
- [8] L. Birgé, Approximation dans les espaces métriques et théorie de l'estimation, Z. Wahrscheinlichkeitstheorie Verw. Gebiete 65 (1983) 181-237. Zbl 0506.62026, MR 722129
- [9] L. Birgé, Sur un théorème de minimax et son application aux tests, Probab. Math. Statist. 3 (1984) 259-282. Zbl 0571.62036, MR 764150
- [10] L. Birgé, Stabilité et instabilité du risque minimax pour des variables indépendantes équidistribuées, Ann. Inst. H. Poincaré Sect. B 20 (1984) 201-223. Zbl 0542.62018, MR 762855
- [11] L. Birgé, On estimating a density using Hellinger distance and some other strange facts, Probab. Theory Related Fields 71 (1986) 271-291. Zbl 0561.62029, MR 816706
- [12] L. Birgé, Model selection for Gaussian regression with random design, Bernoulli 10 (2004) 1039-1051. Zbl 1064.62030, MR 2108042
- [13] L. Birgé, P. Massart, Rates of convergence for minimum contrast estimators, Probab. Theory Related Fields 97 (1993) 113-150. Zbl 0805.62037, MR 1240719
- [14] L. Birgé, P. Massart, From model selection to adaptive estimation, in: Pollard D., Torgersen E., Yang G. (Eds.), Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics, Springer-Verlag, New York, 1997, pp. 55-87. Zbl 0920.62042, MR 1462939
- [15] L. Birgé, P. Massart, Minimum contrast estimators on sieves: exponential bounds and rates of convergence, Bernoulli 4 (1998) 329-375. Zbl 0954.62033, MR 1653272
- [16] L. Birgé, P. Massart, An adaptive compression algorithm in Besov spaces, Constr. Approx. 16 (2000) 1-36. Zbl 1004.41006, MR 1848840
- [17] L. Birgé, P. Massart, Gaussian model selection, J. Eur. Math. Soc. 3 (2001) 203-268. Zbl 1037.62001, MR 1848946
- [18] M.S. Birman, M.Z. Solomjak, Piecewise-polynomial approximation of functions of the classes W_p^α, Mat. Sb. 73 (1967) 295-317. Zbl 0173.16001, MR 217487
- [19] L.D. Brown, M.G. Low, Asymptotic equivalence of nonparametric regression and white noise, Ann. Statist. 24 (1996) 2384-2398. Zbl 0867.62022, MR 1425958
- [20] F. Bunea, A.B. Tsybakov, M.H. Wegkamp, Aggregation for regression learning, Technical report 948, Laboratoire de Probabilités, Université Paris VI, 2004, http://www.proba.jussieu.fr/mathdoc/preprints/index.html#2004. Zbl 1209.62065
- [21] G. Castellan, Modified Akaike's criterion for histogram density estimation, Technical report 99.61, Université Paris-Sud, Orsay, 1999, http://www.math.u-psud.fr/~biblio/pub/1999/.
- [22] G. Castellan, Sélection d'histogrammes à l'aide d'un critère de type Akaike, C. R. Acad. Sci. Paris 330 (2000) 729-732. Zbl 0969.62023, MR 1763919
- [23] O. Catoni, The mixture approach to universal model selection, Technical report LMENS-97-22, Ecole Normale Supérieure, Paris, 1997, http://www.dma.ens.fr/edition/publis/1997/titre97.html. Zbl 0928.62033
- [24] O. Catoni, Statistical learning theory and stochastic optimization, in: Picard J. (Ed.), Lectures on Probability Theory and Statistics, Ecole d'Eté de Probabilités de Saint-Flour XXXI – 2001, Lecture Notes in Math., vol. 1851, Springer-Verlag, Berlin, 2004. Zbl 1076.93002, MR 2163920
- [25] H. Chernoff, A measure of asymptotic efficiency of tests of a hypothesis based on a sum of observations, Ann. Math. Statist. 23 (1952) 493-507. Zbl 0048.11804, MR 57518
- [26] R.A. DeVore, G. Kerkyacharian, D. Picard, V. Temlyakov, Mathematical methods for supervised learning, Technical report 0422, IMI, University of South Carolina, Columbia, 2004, http://www.math.sc.edu/imip/preprints/04.html. Zbl 1146.62322
- [27] R.A. DeVore, G.G. Lorentz, Constructive Approximation, Springer-Verlag, Berlin, 1993. Zbl 0797.41016, MR 1261635
- [28] L. Devroye, G. Lugosi, Combinatorial Methods in Density Estimation, Springer-Verlag, New York, 2001. Zbl 0964.62025, MR 1843146
- [29] D.L. Donoho, I.M. Johnstone, G. Kerkyacharian, D. Picard, Density estimation by wavelet thresholding, Ann. Statist. 24 (1996) 508-539. Zbl 0860.62032, MR 1394974
- [30] D.L. Donoho, R.C. Liu, B. MacGibbon, Minimax risk over hyperrectangles, and implications, Ann. Statist. 18 (1990) 1416-1437. Zbl 0705.62018, MR 1062717
- [31] P.P.B. Eggermont, V.N. LaRiccia, Maximum Penalized Likelihood Estimation, vol. I: Density Estimation, Springer, New York, 2001. Zbl 0984.62026, MR 1837879
- [32] P. Groeneboom, Some current developments in density estimation, in: Bakker J.W. de, Hazewinkel M., Lenstra J.K. (Eds.), Mathematics and Computer Science, CWI Monograph, vol. 1, Elsevier, Amsterdam, 1986, pp. 163-192. Zbl 0593.62030, MR 873578
- [33] L. Györfi, M. Kohler, A. Kryżak, H. Walk, A Distribution-Free Theory of Nonparametric Regression, Springer, New York, 2002. Zbl 1021.62024
- [34] P.J. Huber, A robust version of the probability ratio test, Ann. Math. Statist. 36 (1965) 1753-1758. Zbl 0137.12702, MR 185747
- [35] P.J. Huber, Robust Statistics, John Wiley, New York, 1981. Zbl 0536.62025, MR 606374
- [36] I.M. Johnstone, Chi-square oracle inequalities, in: Gunst M.C.M. de, Klaassen C.A.J., Vaart A.W. van der (Eds.), State of the Art in Probability and Statistics, Festschrift for Willem R. van Zwet, Lecture Notes Monograph Ser., vol. 36, Institute of Mathematical Statistics, 2001, pp. 399-418. MR 1836572
- [37] A. Juditsky, A.S. Nemirovski, Functional aggregation for nonparametric estimation, Ann. Statist. 28 (2000) 681-712. Zbl 1105.62338, MR 1792783
- [38] G. Kerkyacharian, D. Picard, Thresholding algorithms, maxisets and well-concentrated bases, Test 9 (2000) 283-344. Zbl 1107.62323, MR 1821645
- [39] A.N. Kolmogorov, V.M. Tikhomirov, ε-entropy and ε-capacity of sets in function spaces, Amer. Math. Soc. Transl. (2) 17 (1961) 277-364. Zbl 0133.06703
- [40] B. Laurent, P. Massart, Adaptive estimation of a quadratic functional by model selection, Ann. Statist. 28 (2000) 1302-1338. Zbl 1105.62328, MR 1805785
- [41] L.M. Le Cam, On the assumptions used to prove asymptotic normality of maximum likelihood estimates, Ann. Math. Statist. 41 (1970) 802-828. Zbl 0246.62039, MR 267676
- [42] L.M. Le Cam, Limits of experiments, in: Proc. 6th Berkeley Symp. on Math. Stat. and Prob. I, 1972, pp. 245-261. Zbl 0271.62004, MR 415819
- [43] L.M. Le Cam, Convergence of estimates under dimensionality restrictions, Ann. Statist. 1 (1973) 38-53. Zbl 0255.62006, MR 334381
- [44] L.M. Le Cam, On local and global properties in the theory of asymptotic normality of experiments, in: Puri M. (Ed.), Stochastic Processes and Related Topics, vol. 1, Academic Press, New York, 1975, pp. 13-54. Zbl 0389.62011, MR 395005
- [45] L.M. Le Cam, Asymptotic Methods in Statistical Decision Theory, Springer-Verlag, New York, 1986. Zbl 0605.62002, MR 856411
- [46] L.M. Le Cam, Maximum likelihood: an introduction, Inter. Statist. Rev. 58 (1990) 153-171. Zbl 0715.62045
- [47] L.M. Le Cam, Metric dimension and statistical estimation, CRM Proc. and Lecture Notes 11 (1997) 303-311. Zbl 0942.62035, MR 1479680
- [48] G.G. Lorentz, Approximation of Functions, Holt, Rinehart, Winston, New York, 1966. Zbl 0153.38901, MR 213785
- [49] G.G. Lorentz, M. von Golitschek, Y. Makovoz, Constructive Approximation, Advanced Problems, Springer, Berlin, 1996. Zbl 0910.41001, MR 1393437
- [50] A.S. Nemirovski, Topics in non-parametric statistics, in: Bernard P. (Ed.), Lectures on Probability Theory and Statistics, Ecole d'Eté de Probabilités de Saint-Flour XXVIII – 1998, Lecture Notes in Math., vol. 1738, Springer-Verlag, Berlin, 2000, pp. 85-297. Zbl 0998.62033, MR 1775640
- [51] M. Nussbaum, Asymptotic equivalence of density estimation and Gaussian white noise, Ann. Statist. 24 (1996) 2399-2430. Zbl 0867.62035, MR 1425959
- [52] A. Pinkus, n-widths in Approximation Theory, Springer-Verlag, Berlin, 1985. Zbl 0551.41001, MR 774404
- [53] M.S. Pinsker, Optimal filtration of square-integrable signals in Gaussian noise, Problems Inform. Transmission 16 (1980) 120-133. Zbl 0452.94003, MR 624591
- [54] X. Shen, W.H. Wong, Convergence rates of sieve estimates, Ann. Statist. 22 (1994) 580-615. Zbl 0805.62008, MR 1292531
- [55] B.W. Silverman, On the estimation of a probability density function by the maximum penalized likelihood method, Ann. Statist. 10 (1982) 795-810. Zbl 0492.62034, MR 663433
- [56] A.B. Tsybakov, Optimal rates of aggregation, in: Proceedings of 16th Annual Conference on Learning Theory (COLT) and 7th Annual Workshop on Kernel Machines, Lecture Notes in Artificial Intelligence, vol. 2777, Springer-Verlag, Berlin, 2003, pp. 303-313. Zbl 1208.62073
- [57] S. van de Geer, Estimating a regression function, Ann. Statist. 18 (1990) 907-924. Zbl 0709.62040, MR 1056343
- [58] S. van de Geer, Hellinger-consistency of certain nonparametric maximum likelihood estimates, Ann. Statist. 21 (1993) 14-44. Zbl 0779.62033, MR 1212164
- [59] S. van de Geer, Empirical Processes in M-Estimation, Cambridge University Press, Cambridge, 2000. Zbl 1179.62073, MR 1739079
- [60] A.W. van der Vaart, Asymptotic Statistics, Cambridge University Press, Cambridge, 1998. Zbl 0910.62001, MR 1652247
- [61] G. Wahba, Spline Models for Observational Data, SIAM, Philadelphia, PA, 1990. Zbl 0813.62001, MR 1045442
- [62] A. Wald, Note on the consistency of the maximum likelihood estimate, Ann. Math. Statist. 20 (1949) 595-601. Zbl 0034.22902, MR 32169
- [63] M.H. Wegkamp, Model selection in nonparametric regression, Ann. Statist. 31 (2003) 252-273. Zbl 1019.62037, MR 1962506
- [64] W.H. Wong, X. Shen, Probability inequalities for likelihood ratios and convergence rates of sieve MLEs, Ann. Statist. 23 (1995) 339-362. Zbl 0829.62002, MR 1332570
- [65] Y. Yang, Minimax optimal density estimation, Ph.D. dissertation, Dept. of Statistics, Yale University, New Haven, 1996.
- [66] Y. Yang, Mixing strategies for density estimation, Ann. Statist. 28 (2000) 75-87. Zbl 1106.62322, MR 1762904
- [67] Y. Yang, Combining different procedures for adaptive regression, J. Multivariate Anal. 74 (2000) 135-161. Zbl 0964.62032, MR 1790617
- [68] Y. Yang, Adaptive regression by mixing, J. Amer. Statist. Assoc. 96 (2001) 574-588. Zbl 1018.62033, MR 1946426
- [69] Y. Yang, How accurate can any regression procedure be?, Technical report, Iowa State University, Ames, 2001, http://www.public.iastate.edu/yyang/papers/index.html.
- [70] Y. Yang, Aggregating regression procedures to improve performance, Bernoulli 10 (2004) 25-47. Zbl 1040.62030, MR 2044592
- [71] Y. Yang, A.R. Barron, An asymptotic property of model selection criteria, IEEE Trans. Inform. Theory 44 (1998) 95-116. Zbl 0949.62041, MR 1486651
- [72] Y. Yang, A.R. Barron, Information-theoretic determination of minimax rates of convergence, Ann. Statist. 27 (1999) 1564-1599. Zbl 0978.62008, MR 1742500
- [73] Y.G. Yatracos, Rates of convergence of minimum distance estimates and Kolmogorov's entropy, Ann. Statist. 13 (1985) 768-774. Zbl 0576.62057, MR 790571
- [74] B. Yu, Assouad, Fano and Le Cam, in: Pollard D., Torgersen E., Yang G. (Eds.), Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics, Springer-Verlag, New York, 1997, pp. 423-435. Zbl 0896.62032, MR 1462963
Citations in EuDML Documents
- Nathalie Akakpo, Estimating a discrete distribution via histogram selection
- Guillaume Lecué, Shahar Mendelson, On the optimality of the empirical risk minimization procedure for the convex aggregation problem
- Yannick Baraud, Christophe Giraud, Sylvie Huet, Estimator selection in the Gaussian setting
- Yannick Baraud, Estimation of the density of a determinantal process
- Alexandre B. Tsybakov, Agrégation d’estimateurs et optimisation stochastique
- Yannick Baraud, Lucien Birgé, Estimating composite functions by model selection
- Mathieu Sart, Estimation of the transition density of a Markov chain