High-dimensional gaussian model selection on a gaussian design

Nicolas Verzelen

Annales de l'I.H.P. Probabilités et statistiques (2010)

  • Volume: 46, Issue: 2, pages 480-524
  • ISSN: 0246-0203

Abstract

We consider the problem of estimating the conditional mean of a real Gaussian variable $Y=\sum_{i=1}^{p}\theta_i X_i+\varepsilon$, where the vector of covariates $(X_i)_{1\leq i\leq p}$ follows a joint Gaussian distribution. This issue often occurs when one aims at estimating the graph or the distribution of a Gaussian graphical model. We introduce a general model selection procedure based on the minimization of a penalized least-squares-type criterion. It handles a variety of problems, such as ordered and complete variable selection, allows one to incorporate prior knowledge on the model, and applies when the number of covariates p is larger than the number of observations n. Moreover, it is shown to achieve a non-asymptotic oracle inequality independently of the correlation structure of the covariates. We also exhibit various minimax rates of estimation in the considered framework and hence derive adaptivity properties of our procedure.
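The abstract describes selecting among candidate models by minimizing a penalized least-squares criterion. The sketch below illustrates the general idea on an ordered (nested) family of models over a Gaussian design; the specific penalty form and the constant K are hypothetical illustrative choices, not the paper's criterion.

```python
import numpy as np

# Minimal sketch of penalized least-squares model selection over an
# ordered family of models. The multiplicative penalty and K = 2 are
# illustrative assumptions, not the procedure's actual penalty.
rng = np.random.default_rng(0)
n, p = 50, 20
X = rng.standard_normal((n, p))            # Gaussian design
theta = np.zeros(p)
theta[:3] = [2.0, -1.5, 1.0]               # sparse true coefficients
y = X @ theta + rng.standard_normal(n)     # Y = sum_i theta_i X_i + eps

def criterion(y, X, support, K=2.0):
    """Residual sum of squares inflated by a dimension-based penalty."""
    if len(support) == 0:
        rss = float(np.sum(y ** 2))
    else:
        Xs = X[:, support]
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = float(np.sum((y - Xs @ beta) ** 2))
    return rss * (1.0 + K * len(support) / len(y))

# Ordered variable selection: compare the nested models {X_1, ..., X_d}.
scores = [criterion(y, X, list(range(d))) for d in range(p + 1)]
d_hat = int(np.argmin(scores))
print(d_hat)   # selected model dimension
```

Complete variable selection would replace the nested family with all subsets of a given size, with the same criterion deciding among them.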

How to cite


Verzelen, Nicolas. "High-dimensional gaussian model selection on a gaussian design." Annales de l'I.H.P. Probabilités et statistiques 46.2 (2010): 480-524. <http://eudml.org/doc/242316>.

@article{Verzelen2010,
abstract = {We consider the problem of estimating the conditional mean of a real Gaussian variable $Y=\sum_{i=1}^{p}\theta_i X_i+\varepsilon$, where the vector of covariates $(X_i)_{1\leq i\leq p}$ follows a joint Gaussian distribution. This issue often occurs when one aims at estimating the graph or the distribution of a Gaussian graphical model. We introduce a general model selection procedure based on the minimization of a penalized least-squares-type criterion. It handles a variety of problems, such as ordered and complete variable selection, allows one to incorporate prior knowledge on the model, and applies when the number of covariates p is larger than the number of observations n. Moreover, it is shown to achieve a non-asymptotic oracle inequality independently of the correlation structure of the covariates. We also exhibit various minimax rates of estimation in the considered framework and hence derive adaptivity properties of our procedure.},
author = {Verzelen, Nicolas},
journal = {Annales de l'I.H.P. Probabilités et statistiques},
keywords = {model selection; linear regression; oracle inequalities; Gaussian graphical models; minimax rates of estimation},
language = {eng},
number = {2},
pages = {480-524},
publisher = {Gauthier-Villars},
title = {High-dimensional gaussian model selection on a gaussian design},
url = {http://eudml.org/doc/242316},
volume = {46},
year = {2010},
}

TY - JOUR
AU - Verzelen, Nicolas
TI - High-dimensional gaussian model selection on a gaussian design
JO - Annales de l'I.H.P. Probabilités et statistiques
PY - 2010
PB - Gauthier-Villars
VL - 46
IS - 2
SP - 480
EP - 524
AB - We consider the problem of estimating the conditional mean of a real Gaussian variable $Y=\sum_{i=1}^{p}\theta_i X_i+\varepsilon$, where the vector of covariates $(X_i)_{1\leq i\leq p}$ follows a joint Gaussian distribution. This issue often occurs when one aims at estimating the graph or the distribution of a Gaussian graphical model. We introduce a general model selection procedure based on the minimization of a penalized least-squares-type criterion. It handles a variety of problems, such as ordered and complete variable selection, allows one to incorporate prior knowledge on the model, and applies when the number of covariates p is larger than the number of observations n. Moreover, it is shown to achieve a non-asymptotic oracle inequality independently of the correlation structure of the covariates. We also exhibit various minimax rates of estimation in the considered framework and hence derive adaptivity properties of our procedure.
LA - eng
KW - model selection; linear regression; oracle inequalities; Gaussian graphical models; minimax rates of estimation
UR - http://eudml.org/doc/242316
ER -

