An ℓ1-oracle inequality for the Lasso in finite mixture gaussian regression models

Caroline Meynet

ESAIM: Probability and Statistics (2013)

Volume: 17, page 650-671
ISSN: 1292-8100

Abstract

We consider a finite mixture of Gaussian regression models for high-dimensional heterogeneous data where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by an ℓ1-penalized maximum likelihood estimator. We shall provide an ℓ1-oracle inequality satisfied by this Lasso estimator with the Kullback–Leibler loss. In particular, we give a condition on the regularization parameter of the Lasso to obtain such an oracle inequality. Our aim is twofold: to extend the ℓ1-oracle inequality established by Massart and Meynet [12] in the homogeneous Gaussian linear regression case, and to present a complementary result to Städler et al. [18], by studying the Lasso for its ℓ1-regularization properties rather than considering it as a variable selection procedure. Our oracle inequality shall be deduced from a finite mixture Gaussian regression model selection theorem for ℓ1-penalized maximum likelihood conditional density estimation, which is inspired from Vapnik’s method of structural risk minimization [23] and from the theory on model selection for maximum likelihood estimators developed by Massart in [11].
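As a rough illustration of the kind of criterion the abstract describes, the following sketch evaluates an ℓ1-penalized negative log-likelihood for a K-component mixture of Gaussian regressions. It is not taken from the paper: the function name, the argument names, and the choice to penalize only the ℓ1-norm of the regression coefficients are assumptions made for this example, and the paper's exact parameterization and penalty may differ.

```python
import numpy as np


def penalized_neg_log_likelihood(y, X, weights, betas, sigmas, lam):
    """l1-penalized negative log-likelihood for a mixture of Gaussian regressions.

    Hypothetical parameterization (an assumption, not the paper's notation):
      weights: mixing proportions, shape (K,)
      betas:   regression coefficients, shape (K, p)
      sigmas:  component standard deviations, shape (K,)
      lam:     regularization parameter of the Lasso
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    betas = np.asarray(betas, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)

    means = X @ betas.T                 # component means, shape (n, K)
    resid = y[:, None] - means          # residuals per component

    # log( pi_k * N(y_i; x_i^T beta_k, sigma_k^2) ) for every i and k
    log_comp = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi * sigmas ** 2)
        - 0.5 * (resid / sigmas) ** 2
    )

    # Numerically stable log-sum-exp over the K components
    m = log_comp.max(axis=1, keepdims=True)
    log_mix = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))

    # Average negative log-likelihood plus the l1 penalty on the coefficients
    return -log_mix.mean() + lam * np.abs(betas).sum()
```

A Lasso-type estimator in this setting would minimize such a criterion over the mixture parameters; the sketch only evaluates the objective and does not perform the optimization.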

How to cite

MLA
BibTeX
RIS

Meynet, Caroline. "An ℓ1-oracle inequality for the Lasso in finite mixture gaussian regression models." ESAIM: Probability and Statistics 17 (2013): 650-671. <http://eudml.org/doc/274367>.

@article{Meynet2013,
abstract = {We consider a finite mixture of Gaussian regression models for high-dimensional heterogeneous data where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by an ℓ1-penalized maximum likelihood estimator. We shall provide an ℓ1-oracle inequality satisfied by this Lasso estimator with the Kullback–Leibler loss. In particular, we give a condition on the regularization parameter of the Lasso to obtain such an oracle inequality. Our aim is twofold: to extend the ℓ1-oracle inequality established by Massart and Meynet [12] in the homogeneous Gaussian linear regression case, and to present a complementary result to Städler et al. [18], by studying the Lasso for its ℓ1-regularization properties rather than considering it as a variable selection procedure. Our oracle inequality shall be deduced from a finite mixture Gaussian regression model selection theorem for ℓ1-penalized maximum likelihood conditional density estimation, which is inspired from Vapnik’s method of structural risk minimization [23] and from the theory on model selection for maximum likelihood estimators developed by Massart in [11].},
author = {Meynet, Caroline},
journal = {ESAIM: Probability and Statistics},
keywords = {finite mixture of Gaussian regressions model; Lasso; ℓ1-oracle inequalities; model selection by penalization; ℓ1-balls},
language = {eng},
pages = {650-671},
publisher = {EDP-Sciences},
title = {An ℓ1-oracle inequality for the Lasso in finite mixture gaussian regression models},
url = {http://eudml.org/doc/274367},
volume = {17},
year = {2013},
}

TY - JOUR
AU - Meynet, Caroline
TI - An ℓ1-oracle inequality for the Lasso in finite mixture gaussian regression models
JO - ESAIM: Probability and Statistics
PY - 2013
PB - EDP-Sciences
VL - 17
SP - 650
EP - 671
AB - We consider a finite mixture of Gaussian regression models for high-dimensional heterogeneous data where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by an ℓ1-penalized maximum likelihood estimator. We shall provide an ℓ1-oracle inequality satisfied by this Lasso estimator with the Kullback–Leibler loss. In particular, we give a condition on the regularization parameter of the Lasso to obtain such an oracle inequality. Our aim is twofold: to extend the ℓ1-oracle inequality established by Massart and Meynet [12] in the homogeneous Gaussian linear regression case, and to present a complementary result to Städler et al. [18], by studying the Lasso for its ℓ1-regularization properties rather than considering it as a variable selection procedure. Our oracle inequality shall be deduced from a finite mixture Gaussian regression model selection theorem for ℓ1-penalized maximum likelihood conditional density estimation, which is inspired from Vapnik’s method of structural risk minimization [23] and from the theory on model selection for maximum likelihood estimators developed by Massart in [11].
LA - eng
KW - finite mixture of Gaussian regressions model; Lasso; ℓ1-oracle inequalities; model selection by penalization; ℓ1-balls
UR - http://eudml.org/doc/274367
ER -

References

[1] P.L. Bartlett, S. Mendelson and J. Neeman, ℓ1-regularized linear regression: persistence and oracle inequalities. Probability Theory and Related Fields. Springer (2011). Zbl06125014
[2] J.P. Baudry, Sélection de Modèle pour la Classification Non Supervisée. Choix du Nombre de Classes. Ph.D. thesis, Université Paris-Sud 11, France (2009).
[3] P.J. Bickel, Y. Ritov and A.B. Tsybakov, Simultaneous analysis of Lasso and Dantzig selector. Ann. Stat. 37 (2009) 1705–1732. Zbl1173.62022 MR2533469
[4] S. Boucheron, G. Lugosi and P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press (2013). Zbl1279.60005 MR3185193
[5] P. Bühlmann and S. van de Geer, On the conditions used to prove oracle results for the Lasso. Electron. J. Stat. 3 (2009) 1360–1392. Zbl1327.62425 MR2576316
[6] E. Candes and T. Tao, The Dantzig selector: statistical estimation when p is much larger than n. Ann. Stat. 35 (2007) 2313–2351. Zbl1139.62019 MR2382644
[7] S. Cohen and E. Le Pennec, Conditional Density Estimation by Penalized Likelihood Model Selection and Applications, RR-7596. INRIA (2011).
[8] B. Efron, T. Hastie, I. Johnstone and R. Tibshirani, Least Angle Regression. Ann. Stat. 32 (2004) 407–499. Zbl1091.62054 MR2060166
[9] M. Hebiri, Quelques questions de sélection de variables autour de l’estimateur Lasso. Ph.D. Thesis, Université Paris Diderot, Paris 7, France (2009).
[10] C. Huang, G.H.L. Cheang and A.R. Barron, Risk of penalized least squares, greedy selection and ℓ1-penalization for flexible function libraries. Submitted to the Annals of Statistics (2008). MR2711791
[11] P. Massart, Concentration inequalities and model selection. Ecole d’été de Probabilités de Saint-Flour 2003. Lect. Notes Math. Springer, Berlin-Heidelberg (2007). Zbl1170.60006 MR2319879
[12] P. Massart and C. Meynet, The Lasso as an ℓ1-ball model selection procedure. Electron. J. Stat. 5 (2011) 669–687. Zbl1274.62468 MR2820635
[13] C. Maugis and B. Michel, A non asymptotic penalized criterion for Gaussian mixture model selection. ESAIM: PS 15 (2011) 41–68. Zbl06157507 MR2870505
[14] G. McLachlan and D. Peel, Finite Mixture Models. Wiley, New York (2000). Zbl0963.62061 MR1789474
[15] N. Meinshausen and B. Yu, Lasso type recovery of sparse representations for high dimensional data. Ann. Stat. 37 (2009) 246–270. Zbl1155.62050 MR2488351
[16] R.A. Redner and H.F. Walker, Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev. 26 (1984) 195–239. Zbl0536.62021 MR738930
[17] P. Rigollet and A. Tsybakov, Exponential screening and optimal rates of sparse estimation. Ann. Stat. 39 (2011) 731–771. Zbl1215.62043 MR2816337
[18] N. Städler, P. Bühlmann and S. van de Geer, ℓ1-penalization for mixture regression models. Test 19 (2010) 209–256. Zbl1203.62128
[19] R. Tibshirani, Regression shrinkage and selection via the Lasso. J. Roy. Stat. Soc. Ser. B 58 (1996) 267–288. Zbl0850.62538 MR1379242
[20] M.R. Osborne, B. Presnell and B.A. Turlach, On the Lasso and its dual. J. Comput. Graph. Stat. 9 (2000) 319–337. MR1822089
[21] M.R. Osborne, B. Presnell and B.A. Turlach, A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20 (2000) 389–404. Zbl0962.65036 MR1773265
[22] A. van der Vaart and J. Wellner, Weak Convergence and Empirical Processes. Springer, Berlin (1996). Zbl0862.60002 MR1385671
[23] V.N. Vapnik, Estimation of Dependencies Based on Empirical Data. Springer, New York (1982). Zbl0499.62005 MR672244
[24] V.N. Vapnik, Statistical Learning Theory. J. Wiley, New York (1990). Zbl0935.62007 MR1641250
[25] P. Zhao and B. Yu, On model selection consistency of Lasso. J. Mach. Learn. Res. 7 (2006) 2541–2563. Zbl1222.62008 MR2274449
