Agrégation d’estimateurs et optimisation stochastique
Journal de la société française de statistique (2008)
- Volume: 149, Issue: 1, pages 3-26
- ISSN: 1962-5197
Abstract
This article follows the Lucien Le Cam Lecture that I had the honor of giving at the 37th Journées de Statistique in Pau in 2005. It presents an overview of some recent results on methods for the aggregation of estimators. These methods construct, from a given set of $M$ estimators, a linear or convex combination of these estimators with random weights chosen in an optimal way. We emphasize the link between aggregation and stochastic optimization, which leads us to new and highly effective recursive aggregation procedures.
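To fix ideas on this setting: given base estimators $\hat f_1,\dots,\hat f_M$, an aggregate is a combination $\hat f_\theta = \sum_{j=1}^M \theta_j \hat f_j$, where the weight vector $\theta$ ranges over the simplex (convex aggregation) or over all of $\mathbb{R}^M$ (linear aggregation). The sketch below illustrates the kind of recursive procedure the abstract alludes to: entropic mirror descent with averaging over the simplex, under squared loss, in the spirit of reference [20]. It is a minimal sketch under assumptions of our own (base estimators fixed in advance, i.i.d. data, an illustrative step size of order $\sqrt{\log M / n}$); the function and variable names are ours, not the paper's.

import numpy as np

def mirror_descent_aggregate(preds, y, gamma=None):
    """Convex aggregation by entropic mirror descent with averaging.

    preds : (n, M) array; preds[t, j] is the j-th base estimator
            evaluated at the t-th observation (estimators held fixed).
    y     : (n,) array of responses.
    gamma : step size; sqrt(log M / n) is used as an illustrative default.
    Returns the averaged weight vector theta_bar in the simplex.
    """
    n, M = preds.shape
    if gamma is None:
        gamma = np.sqrt(np.log(M) / n)   # illustrative choice (assumption)
    z = np.zeros(M)                      # dual variable (sum of negative gradients)
    theta_bar = np.zeros(M)
    for t in range(n):
        w = np.exp(z - z.max())          # entropic mirror map (softmax):
        theta = w / w.sum()              # maps the dual point to the simplex
        grad = 2.0 * (theta @ preds[t] - y[t]) * preds[t]  # squared-loss gradient in theta
        z -= gamma * grad                # mirror descent step in the dual
        theta_bar += theta / n           # average the primal iterates
    return theta_bar

# Example: aggregate three crude regression estimators of y = x^2 on [0, 1].
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 500)
y = x**2 + 0.1 * rng.normal(size=500)
base = np.column_stack([np.full_like(x, y.mean()), x, 1.5 * x - 0.25])
print(mirror_descent_aggregate(base, y))  # nonnegative weights summing to 1

The averaging step is what matters here: it is the averaged weight vector, not the last iterate, for which oracle inequalities with remainder terms of order $\sqrt{\log M / n}$ are established for recursive procedures of this type (cf. [20]).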
How to cite
Tsybakov, Alexandre B. "Agrégation d’estimateurs et optimisation stochastique." Journal de la société française de statistique 149.1 (2008): 3-26. <http://eudml.org/doc/93474>.
@article{Tsybakov2008,
abstract = {Cet article fait suite à la Conférence Lucien Le Cam que j’ai eu l’honneur de donner lors des XXXVIIèmes Journées de Statistique à Pau, en 2005. Il présente un aperçu de quelques résultats récents sur les méthodes d’agrégation d’estimateurs. Ces méthodes consistent à construire, à partir d’un ensemble de $M$ estimateurs donnés, une combinaison linéaire ou convexe de ces estimateurs avec des poids aléatoires choisis de façon optimale. Nous mettons l’accent sur le lien entre agrégation et optimisation stochastique, ce qui nous permet d’aboutir à de nouvelles procédures récursives d’agrégation très performantes.},
author = {Tsybakov, Alexandre B.},
journal = {Journal de la société française de statistique},
keywords = {aggregation; stochastic optimization; mirror descent; adaptive estimation; optimal rate of aggregation},
language = {fre},
number = {1},
pages = {3-26},
publisher = {Société française de statistique},
title = {Agrégation d’estimateurs et optimisation stochastique},
url = {http://eudml.org/doc/93474},
volume = {149},
year = {2008},
}
TY - JOUR
AU - Tsybakov, Alexandre B.
TI - Agrégation d’estimateurs et optimisation stochastique
JO - Journal de la société française de statistique
PY - 2008
PB - Société française de statistique
VL - 149
IS - 1
SP - 3
EP - 26
AB - Cet article fait suite à la Conférence Lucien Le Cam que j’ai eu l’honneur de donner lors des XXXVIIèmes Journées de Statistique à Pau, en 2005. Il présente un aperçu de quelques résultats récents sur les méthodes d’agrégation d’estimateurs. Ces méthodes consistent à construire, à partir d’un ensemble de $M$ estimateurs donnés, une combinaison linéaire ou convexe de ces estimateurs avec des poids aléatoires choisis de façon optimale. Nous mettons l’accent sur le lien entre agrégation et optimisation stochastique, ce qui nous permet d’aboutir à de nouvelles procédures récursives d’agrégation très performantes.
LA - fre
KW - aggregation; stochastic optimization; mirror descent; adaptive estimation; optimal rate of aggregation
UR - http://eudml.org/doc/93474
ER -
References
- [1] Audibert J.-Y. (2004). Aggregated estimators and empirical complexity for least square regression. Annales de l’Institut Henri Poincaré (B) Probabilités et Statistiques 40: 685–736. Zbl1052.62037 MR2096215
- [2] Barron A., Cohen A., Dahmen W. and DeVore R. (2005). Approximation and learning by greedy algorithms. Annals of Statistics, to appear. Zbl1138.62019
- [3] Ben-Tal A. and Nemirovski A.S. (1999). The conjugate barrier mirror descent method for non-smooth convex optimization. MINERVA Optim. Center Report, Haifa: Faculty of Industrial Engineering and Management, Technion – Israel Institute of Technology. http://iew3.technion.ac.il/Labs/Opt/opt/Pap/CP_MD.pdf
- [4] Birgé L. (2006). Model selection via testing: an alternative to (penalized) maximum likelihood estimators. Annales de l’Institut Henri Poincaré (B) Probabilités et Statistiques 42: 273–325. Zbl1333.62094 MR2219712
- [5] Bickel P.J., Ritov Y. and Zakai A. (2006). Some theory for generalized boosting algorithms. J. Machine Learning Research 7: 705–732. Zbl1222.68148 MR2274384
- [6] Bühlmann P. and Yu B. (2005). Sparse boosting. J. Machine Learning Research 7: 1001–1024. MR2274395
- [7] Bunea F., Tsybakov A.B. and Wegkamp M.H. (2004). Aggregation for regression learning. Preprint LPMA, Universités Paris 6 – Paris 7, no. 948, arXiv:math.ST/0410214 and https://hal.ccsd.cnrs.fr/ccsd-00003205. Zbl1209.62065
- [8] Bunea F., Tsybakov A.B. and Wegkamp M.H. (2006). Aggregation and sparsity via $\ell_1$ penalized least squares. Proceedings of 19th Annual Conference on Learning Theory (COLT 2006), Lecture Notes in Artificial Intelligence, v. 4005 (G. Lugosi and H.U. Simon, eds.), Springer-Verlag, Berlin-Heidelberg, 379–391. Zbl1143.62319 MR2280619
- [9] Bunea F., Tsybakov A.B. and Wegkamp M.H. (2007a). Aggregation for Gaussian regression. Annals of Statistics 35: 1674–1697. Zbl1209.62065 MR2351101
- [10] Bunea F., Tsybakov A.B. and Wegkamp M.H. (2007b). Sparsity oracle inequalities for the Lasso. Electronic Journal of Statistics 1: 169–194. Zbl1146.62028 MR2312149
- [11] Bunea F., Tsybakov A.B. and Wegkamp M.H. (2007c). Sparse density estimation with $\ell_1$ penalties. Proceedings of 20th Annual Conference on Learning Theory (COLT 2007), Lecture Notes in Artificial Intelligence, v. 4539 (N.H. Bshouty and C. Gentile, eds.), Springer-Verlag, Berlin-Heidelberg, 530–543. Zbl1203.62053 MR2397610
- [12] Catoni O. (1999). “Universal” aggregation rules with exact bias bounds. Preprint LPMA, Universités Paris 6 – Paris 7, no. 510.
- [13] Catoni O. (2004). Statistical Learning Theory and Stochastic Optimization. Ecole d’Eté de Probabilités de Saint-Flour XXXI – 2001, Lecture Notes in Mathematics, v. 1851, Springer, New York. Zbl1076.93002 MR2163920
- [14] Cesa-Bianchi N. and Lugosi G. (2006). Prediction, Learning, and Games. Cambridge Univ. Press. Zbl1114.91001 MR2409394
- [15] Dalalyan A., Juditsky A. and Spokoiny V. (2007). A new algorithm for estimating the effective dimension-reduction subspace. arXiv:math/0701887v1. Zbl1225.62091
- [16] Dalalyan A. and Tsybakov A.B. (2007). Aggregation by exponential weighting and sharp oracle inequalities. Proceedings of 20th Annual Conference on Learning Theory (COLT 2007), Lecture Notes in Artificial Intelligence, v. 4539 (N.H. Bshouty and C. Gentile, eds.), Springer-Verlag, Berlin-Heidelberg, 97–111. Zbl1203.62063 MR2397581
- [17] Freund Y. (1995). Boosting a weak learning algorithm by majority. Information and Computation 121: 256–285. Zbl0833.68109 MR1348530
- [18] Hristache M., Juditsky A. and Spokoiny V. (2001). Direct estimation of the index coefficient in a single-index model. Annals of Statistics 29: 593–623. Zbl1012.62043 MR1865333
- [19] Huber P.J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. Proc. Fifth Berkeley Symp. Math. Statist. Prob. 1: 221–234. Zbl0212.21504 MR216620
- [20] Juditsky A.B., Nazin A.V., Tsybakov A.B. and Vayatis N. (2005). Recursive aggregation of estimators by the mirror descent algorithm with averaging. Problems of Information Transmission 41: 368–384. Zbl1123.62044 MR2198228
- [21] Juditsky A. and Nemirovski A. (2000). Functional aggregation for nonparametric estimation. Annals of Statistics 28: 681–712. Zbl1105.62338 MR1792783
- [22] Juditsky A., Rigollet Ph. and Tsybakov A.B. (2005). Learning by mirror averaging. Annals of Statistics, to appear. Zbl1274.62288
- [23] Klemelä J. (2006). Density estimation with stagewise optimization of the empirical risk. Machine Learning 67: 169–195.
- [24] Koltchinskii V. (2006a). Local Rademacher complexities and oracle inequalities in empirical risk minimization (with discussion). Annals of Statistics 34: 2697–2706. Zbl1118.62065 MR2329464
- [25] Koltchinskii V. (2006b). Sparsity in penalized empirical risk minimization. Annales de l’Institut Henri Poincaré (B) Probabilités et Statistiques, to appear. Zbl1168.62044
- [26] Le Cam L. (1953). On some asymptotic properties of maximum likelihood estimates and related Bayes estimates. University of California Publications in Statistics 1: 277–329. Zbl0052.15404 MR54913
- [27] Lecué G. (2005). Lower bounds and aggregation in density estimation. J. Machine Learning Research 7: 971–981. Zbl1222.62052 MR2274393
- [28] Lecué G. (2007a). Méthodes d’agrégation : optimalité et vitesses rapides. PhD thesis, Université Paris 6. http://tel.archives-ouvertes.fr/tel-00150402
- [29] Lecué G. (2007b). Suboptimality of penalized empirical risk minimization in classification. Proceedings of 20th Annual Conference on Learning Theory (COLT 2007), Lecture Notes in Artificial Intelligence, v. 4539 (N.H. Bshouty and C. Gentile, eds.), Springer-Verlag, Berlin-Heidelberg, 142–156. Zbl1203.68159 MR2397584
- [30] Lugosi G. (2006). Prédiction randomisée de suites individuelles. J. Société Française de Statistique 147: 5–37.
- [31] Lounici K. (2007). Generalized mirror averaging and D-convex aggregation. Mathematical Methods of Statistics 16: 246–259. Zbl1231.62046 MR2356820
- [32] Lugosi G. and Vayatis N. (2004). On the Bayes-risk consistency of regularized boosting methods (with discussion). Annals of Statistics 32: 30–55. Zbl1105.62319 MR2051000
- [33] Mannor S., Meir R. and Zhang T. (2003). Greedy algorithms for classification – consistency, convergence rates, and adaptivity. Journal of Machine Learning Research 4: 713–742. Zbl1105.68388 MR2072266
- [34] Nemirovski A. (2000). Topics in Non-parametric Statistics. Ecole d’Eté de Probabilités de Saint-Flour XXVIII – 1998, Lecture Notes in Mathematics, v. 1738, Springer, New York. Zbl0998.62033 MR1775640
- [35] Nemirovski A.S. and Yudin D.B. (1983). Problem Complexity and Method Efficiency in Optimization. Wiley, Chichester. Zbl0501.90062 MR702836
- [36] Nesterov Yu. (2005). Primal-dual subgradient methods for convex problems. CORE Discussion Paper 2005/67, Center for Operations Research and Econometrics, Louvain-la-Neuve, Belgium.
- [37] Nesterov Yu. (2007). Primal-dual subgradient methods for convex problems. Mathematical Programming, published online, DOI: 10.1007/s10107-007-0149-x.
- [38] Rigollet Ph. (2006). Inégalités d’oracle, agrégation et adaptation. PhD thesis, Université Paris 6. http://tel.archives-ouvertes.fr/tel-00115494
- [39] Rigollet Ph. and Tsybakov A. (2007). Linear and convex aggregation of density estimators. Mathematical Methods of Statistics 16: 260–280. Zbl1231.62057 MR2356821
- [40] Robbins H. and Monro S. (1951). A stochastic approximation method. Annals of Mathematical Statistics 22: 400–407. Zbl0054.05901 MR42668
- [41] Rockafellar R.T. and Wets R.J.B. (1998). Variational Analysis. Springer, New York. Zbl0888.49001 MR1491362
- [42] Samarov A. and Tsybakov A. (2007). Aggregation of density estimators and dimension reduction. Advances in Statistical Modeling and Inference. Essays in Honor of Kjell A. Doksum (V. Nair, ed.), World Scientific, Singapore, 233–251. MR2416118
- [43] Schapire R.E. (1990). The strength of weak learnability. Machine Learning 5: 197–227. Zbl0747.68058
- [44] Tsybakov A. (2003). Optimal rates of aggregation. Computational Learning Theory and Kernel Machines (B. Schölkopf and M. Warmuth, eds.), Lecture Notes in Artificial Intelligence, v. 2777, Springer, Heidelberg, 303–313.
- [45] Vajda I. (1986). Theory of Statistical Inference and Information. Kluwer, Dordrecht. Zbl0711.62002
- [46] Van de Geer S.A. (2006). High dimensional generalized linear models and the Lasso. Annals of Statistics, to appear. Zbl1138.62323
- [47] Vapnik V. and Chervonenkis A. (1974). Theory of Pattern Recognition. Nauka, Moscow (in Russian). German translation: Wapnik W. und Tscherwonenkis A., Theorie der Zeichenerkennung, Akademie-Verlag, Berlin, 1979. MR474638
- [48] Wand M.P. and Jones M.C. (1995). Kernel Smoothing. Chapman and Hall, London. Zbl0854.62043 MR1319818
- [49] Wegkamp M.H. (2003). Model selection in nonparametric regression. Annals of Statistics 31: 252–273. Zbl1019.62037 MR1962506
- [50] Yang Y. (2000a). Mixing strategies for density estimation. Annals of Statistics 28: 75–87. Zbl1106.62322 MR1762904
- [51] Yang Y. (2000b). Combining different procedures for adaptive regression. Journal of Multivariate Analysis 74: 135–161. Zbl0964.62032 MR1790617
- [52] Yang Y. (2001). Adaptive regression by mixing. Journal of the American Statistical Association 96: 574–588. Zbl1018.62033 MR1946426
- [53] Yang Y. (2004). Aggregating regression procedures for a better performance. Bernoulli 10: 25–47. Zbl1040.62030 MR2044592
- [54] Zhang T. and Yu B. (2005). Boosting with early stopping: convergence and consistency. Annals of Statistics 33: 1538–1579. Zbl1078.62038 MR2166555