An ℓ1-oracle inequality for the Lasso in finite mixture Gaussian regression models

Caroline Meynet

ESAIM: Probability and Statistics (2013)

  • Volume: 17, pages 650-671
  • ISSN: 1292-8100

Abstract

We consider a finite mixture of Gaussian regression models for high-dimensional heterogeneous data, where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by an ℓ1-penalized maximum likelihood estimator. We provide an ℓ1-oracle inequality satisfied by this Lasso estimator with respect to the Kullback–Leibler loss. In particular, we give a condition on the regularization parameter of the Lasso under which such an oracle inequality holds. Our aim is twofold: to extend the ℓ1-oracle inequality established by Massart and Meynet [12] in the homogeneous Gaussian linear regression case, and to present a result complementary to Städler et al. [18] by studying the Lasso for its ℓ1-regularization properties rather than viewing it as a variable selection procedure. Our oracle inequality is deduced from a model selection theorem for ℓ1-penalized maximum likelihood conditional density estimation in finite mixture Gaussian regression models, which is inspired by Vapnik’s method of structural risk minimization [23] and by the theory of model selection for maximum likelihood estimators developed by Massart [11].
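
To fix notation, the following is a schematic form of the estimator the abstract refers to; the parametrization ψ = (π_k, β_k, σ_k)_{1 ≤ k ≤ K} and the exact weighting of the penalty are assumptions made for this sketch, not details taken from the paper. Writing the conditional mixture density as

\[
s_{\psi}(y \mid x) \;=\; \sum_{k=1}^{K} \pi_k \, \varphi\!\left(y;\, x^{\top}\beta_k,\, \sigma_k^{2}\right),
\]

where \varphi(\,\cdot\,;\mu,\sigma^{2}) denotes the Gaussian density with mean \mu and variance \sigma^{2}, the Lasso estimator is an ℓ1-penalized maximum likelihood estimator of the form

\[
\hat{s}^{\mathrm{L}}(\lambda) \;\in\; \operatorname*{arg\,min}_{\psi} \left\{ -\frac{1}{n} \sum_{i=1}^{n} \ln s_{\psi}(y_i \mid x_i) \;+\; \lambda \sum_{k=1}^{K} \lVert \beta_k \rVert_{1} \right\}.
\]

The ℓ1-oracle inequality then bounds the Kullback–Leibler risk of this estimator by the best trade-off between approximation error and an ℓ1-penalty term, provided the regularization parameter λ is chosen large enough.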

How to cite


Meynet, Caroline. "An ℓ1-oracle inequality for the Lasso in finite mixture Gaussian regression models." ESAIM: Probability and Statistics 17 (2013): 650-671. <http://eudml.org/doc/274367>.

@article{Meynet2013,
abstract = {We consider a finite mixture of Gaussian regression models for high-dimensional heterogeneous data, where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by an ℓ1-penalized maximum likelihood estimator. We provide an ℓ1-oracle inequality satisfied by this Lasso estimator with respect to the Kullback–Leibler loss. In particular, we give a condition on the regularization parameter of the Lasso under which such an oracle inequality holds. Our aim is twofold: to extend the ℓ1-oracle inequality established by Massart and Meynet [12] in the homogeneous Gaussian linear regression case, and to present a result complementary to Städler et al. [18] by studying the Lasso for its ℓ1-regularization properties rather than viewing it as a variable selection procedure. Our oracle inequality is deduced from a model selection theorem for ℓ1-penalized maximum likelihood conditional density estimation in finite mixture Gaussian regression models, which is inspired by Vapnik’s method of structural risk minimization [23] and by the theory of model selection for maximum likelihood estimators developed by Massart [11].},
author = {Meynet, Caroline},
journal = {ESAIM: Probability and Statistics},
keywords = {finite mixture of Gaussian regressions model; Lasso; ℓ1-oracle inequalities; model selection by penalization; ℓ1-balls},
language = {eng},
pages = {650-671},
publisher = {EDP-Sciences},
title = {An ℓ1-oracle inequality for the Lasso in finite mixture Gaussian regression models},
url = {http://eudml.org/doc/274367},
volume = {17},
year = {2013},
}

TY - JOUR
AU - Meynet, Caroline
TI - An ℓ1-oracle inequality for the Lasso in finite mixture Gaussian regression models
JO - ESAIM: Probability and Statistics
PY - 2013
PB - EDP-Sciences
VL - 17
SP - 650
EP - 671
AB - We consider a finite mixture of Gaussian regression models for high-dimensional heterogeneous data, where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by an ℓ1-penalized maximum likelihood estimator. We provide an ℓ1-oracle inequality satisfied by this Lasso estimator with respect to the Kullback–Leibler loss. In particular, we give a condition on the regularization parameter of the Lasso under which such an oracle inequality holds. Our aim is twofold: to extend the ℓ1-oracle inequality established by Massart and Meynet [12] in the homogeneous Gaussian linear regression case, and to present a result complementary to Städler et al. [18] by studying the Lasso for its ℓ1-regularization properties rather than viewing it as a variable selection procedure. Our oracle inequality is deduced from a model selection theorem for ℓ1-penalized maximum likelihood conditional density estimation in finite mixture Gaussian regression models, which is inspired by Vapnik’s method of structural risk minimization [23] and by the theory of model selection for maximum likelihood estimators developed by Massart [11].
LA - eng
KW - finite mixture of Gaussian regressions model; Lasso; ℓ1-oracle inequalities; model selection by penalization; ℓ1-balls
UR - http://eudml.org/doc/274367
ER -

References

  1. [1] P.L. Bartlett, S. Mendelson and J. Neeman, ℓ1-regularized linear regression: persistence and oracle inequalities. Probab. Theory Relat. Fields (2011). Zbl06125014
  2. [2] J.P. Baudry, Sélection de Modèle pour la Classification Non Supervisée. Choix du Nombre de Classes. Ph.D. thesis, Université Paris-Sud 11, France (2009).
  3. [3] P.J. Bickel, Y. Ritov and A.B. Tsybakov, Simultaneous analysis of Lasso and Dantzig selector. Ann. Stat. 37 (2009) 1705–1732. Zbl1173.62022 MR2533469
  4. [4] S. Boucheron, G. Lugosi and P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press (2013). Zbl1279.60005 MR3185193
  5. [5] P. Bühlmann and S. van de Geer, On the conditions used to prove oracle results for the Lasso. Electron. J. Stat. 3 (2009) 1360–1392. Zbl1327.62425 MR2576316
  6. [6] E. Candès and T. Tao, The Dantzig selector: statistical estimation when p is much larger than n. Ann. Stat. 35 (2007) 2313–2351. Zbl1139.62019 MR2382644
  7. [7] S. Cohen and E. Le Pennec, Conditional Density Estimation by Penalized Likelihood Model Selection and Applications. Research Report RR-7596, INRIA (2011).
  8. [8] B. Efron, T. Hastie, I. Johnstone and R. Tibshirani, Least Angle Regression. Ann. Stat. 32 (2004) 407–499. Zbl1091.62054 MR2060166
  9. [9] M. Hebiri, Quelques questions de sélection de variables autour de l’estimateur Lasso. Ph.D. thesis, Université Paris Diderot, Paris 7, France (2009).
  10. [10] C. Huang, G.H.L. Cheang and A.R. Barron, Risk of penalized least squares, greedy selection and ℓ1-penalization for flexible function libraries. Submitted to the Annals of Statistics (2008). MR2711791
  11. [11] P. Massart, Concentration inequalities and model selection. Ecole d’été de Probabilités de Saint-Flour 2003. Lect. Notes Math. Springer, Berlin-Heidelberg (2007). Zbl1170.60006 MR2319879
  12. [12] P. Massart and C. Meynet, The Lasso as an ℓ1-ball model selection procedure. Electron. J. Stat. 5 (2011) 669–687. Zbl1274.62468 MR2820635
  13. [13] C. Maugis and B. Michel, A non asymptotic penalized criterion for Gaussian mixture model selection. ESAIM: PS 15 (2011) 41–68. Zbl06157507 MR2870505
  14. [14] G. McLachlan and D. Peel, Finite Mixture Models. Wiley, New York (2000). Zbl0963.62061 MR1789474
  15. [15] N. Meinshausen and B. Yu, Lasso-type recovery of sparse representations for high-dimensional data. Ann. Stat. 37 (2009) 246–270. Zbl1155.62050 MR2488351
  16. [16] R.A. Redner and H.F. Walker, Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev. 26 (1984) 195–239. Zbl0536.62021 MR738930
  17. [17] P. Rigollet and A. Tsybakov, Exponential screening and optimal rates of sparse estimation. Ann. Stat. 39 (2011) 731–771. Zbl1215.62043 MR2816337
  18. [18] N. Städler, P. Bühlmann and S. van de Geer, ℓ1-penalization for mixture regression models. Test 19 (2010) 209–256. Zbl1203.62128
  19. [19] R. Tibshirani, Regression shrinkage and selection via the Lasso. J. Roy. Stat. Soc. Ser. B 58 (1996) 267–288. Zbl0850.62538 MR1379242
  20. [20] M.R. Osborne, B. Presnell and B.A. Turlach, On the Lasso and its dual. J. Comput. Graph. Stat. 9 (2000) 319–337. MR1822089
  21. [21] M.R. Osborne, B. Presnell and B.A. Turlach, A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20 (2000) 389–404. Zbl0962.65036 MR1773265
  22. [22] A. van der Vaart and J. Wellner, Weak Convergence and Empirical Processes. Springer, Berlin (1996). Zbl0862.60002 MR1385671
  23. [23] V.N. Vapnik, Estimation of Dependences Based on Empirical Data. Springer, New York (1982). Zbl0499.62005 MR672244
  24. [24] V.N. Vapnik, Statistical Learning Theory. Wiley, New York (1998). Zbl0935.62007 MR1641250
  25. [25] P. Zhao and B. Yu, On model selection consistency of Lasso. J. Mach. Learn. Res. 7 (2006) 2541–2563. Zbl1222.62008 MR2274449
