A non asymptotic penalized criterion for Gaussian mixture model selection

Cathy Maugis; Bertrand Michel

ESAIM: Probability and Statistics (2011)

  • Volume: 15, pages 41–68
  • ISSN: 1292-8100

Abstract

Specific Gaussian mixtures are considered to solve variable selection and clustering problems simultaneously. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the nonlinearity of the associated Kullback–Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by Massart [Concentration inequalities and model selection. Springer, Berlin (2007); lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the form of the penalty function. This theorem requires controlling the bracketing entropy of Gaussian mixture families. Both the ordered and non-ordered variable selection cases are addressed in this paper.
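
For orientation, the criterion described in the abstract has the generic penalized maximum-likelihood shape sketched below in LaTeX. This is only a schematic under assumed notation (n i.i.d. observations X_1, ..., X_n; a model indexed by the number of components K and the variable subset v; maximum likelihood estimator ŝ_(K,v); model dimension D_(K,v); constant κ); the exact penalty proved in the paper is not reproduced here.

% Generic penalized maximum-likelihood criterion (schematic sketch only;
% the notation below is illustrative, not the paper's exact statement).
\[
  (\widehat{K}, \widehat{v})
    = \operatorname*{arg\,min}_{(K,\,v)}
      \left\{ -\frac{1}{n} \sum_{i=1}^{n} \ln \widehat{s}_{(K,v)}(X_i)
              + \operatorname{pen}(K, v) \right\}
\]
% Massart-type theorems lower-bound the penalty via the model dimension
% and the bracketing entropy of the mixture family, typically giving
\[
  \operatorname{pen}(K, v)
    \;\gtrsim\; \kappa \, \frac{D_{(K,v)}}{n}
      \left( 1 + \ln \frac{n}{D_{(K,v)}} \right)
\]
% with an additional weight term accounting for the number of competing
% variable subsets in the non-ordered selection case.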

How to cite


Maugis, Cathy, and Michel, Bertrand. "A non asymptotic penalized criterion for Gaussian mixture model selection." ESAIM: Probability and Statistics 15 (2011): 41–68. <http://eudml.org/doc/277155>.

@article{Maugis2011,
abstract = {Specific Gaussian mixtures are considered to solve variable selection and clustering problems simultaneously. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the nonlinearity of the associated Kullback–Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by Massart [Concentration inequalities and model selection. Springer, Berlin (2007); lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the form of the penalty function. This theorem requires controlling the bracketing entropy of Gaussian mixture families. Both the ordered and non-ordered variable selection cases are addressed in this paper.},
author = {Maugis, Cathy and Michel, Bertrand},
journal = {ESAIM: Probability and Statistics},
keywords = {model-based clustering; variable selection; penalized likelihood criterion; bracketing entropy},
language = {eng},
pages = {41-68},
publisher = {EDP-Sciences},
title = {A non asymptotic penalized criterion for Gaussian mixture model selection},
url = {http://eudml.org/doc/277155},
volume = {15},
year = {2011},
}

TY - JOUR
AU - Maugis, Cathy
AU - Michel, Bertrand
TI - A non asymptotic penalized criterion for Gaussian mixture model selection
JO - ESAIM: Probability and Statistics
PY - 2011
PB - EDP-Sciences
VL - 15
SP - 41
EP - 68
AB - Specific Gaussian mixtures are considered to solve variable selection and clustering problems simultaneously. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the nonlinearity of the associated Kullback–Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by Massart [Concentration inequalities and model selection. Springer, Berlin (2007); lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the form of the penalty function. This theorem requires controlling the bracketing entropy of Gaussian mixture families. Both the ordered and non-ordered variable selection cases are addressed in this paper.
LA - eng
KW - model-based clustering; variable selection; penalized likelihood criterion; bracketing entropy
UR - http://eudml.org/doc/277155
ER -

References

  [1] H. Akaike, Information theory and an extension of the maximum likelihood principle, in Second International Symposium on Information Theory (Tsahkadsor, 1971). Akadémiai Kiadó, Budapest (1973) 267–281. Zbl0283.62006 MR483125
  [2] S. Arlot and P. Massart, Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. (2008) (to appear).
  [3] J.D. Banfield and A.E. Raftery, Model-based Gaussian and non-Gaussian clustering. Biometrics 49 (1993) 803–821. Zbl0794.62034 MR1243494
  [4] A. Barron, L. Birgé and P. Massart, Risk bounds for model selection via penalization. Prob. Th. Rel. Fields 113 (1999) 301–413. Zbl0946.62036 MR1679028
  [5] J.-P. Baudry, Clustering through model selection criteria. Poster session at One Day Statistical Workshop in Lisieux, http://www.math.u-psud.fr/~baudry, June (2007).
  [6] C. Biernacki, G. Celeux and G. Govaert, Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans. Pattern Anal. Mach. Intell. 22 (2000) 719–725.
  [7] C. Biernacki, G. Celeux, G. Govaert and F. Langrognet, Model-based cluster and discriminant analysis with the mixmod software. Comput. Stat. Data Anal. 51 (2006) 587–600. Zbl1157.62431 MR2297473
  [8] L. Birgé and P. Massart, Gaussian model selection. J. Eur. Math. Soc. 3 (2001) 203–268. Zbl1037.62001 MR1848946
  [9] L. Birgé and P. Massart, A generalized Cp criterion for Gaussian model selection. Prépublication n° 647, Universités de Paris 6 et Paris 7 (2001). Zbl1037.62001
  [10] L. Birgé and P. Massart, Minimal penalties for Gaussian model selection. Prob. Th. Rel. Fields 138 (2007) 33–73. Zbl1112.62082 MR2288064
  [11] L. Birgé and P. Massart, From model selection to adaptive estimation, in Festschrift for Lucien Le Cam. Springer, New York (1997) 55–87. Zbl0920.62042 MR1462939
  [12] C. Bouveyron, S. Girard and C. Schmid, High-Dimensional Data Clustering. Comput. Stat. Data Anal. 52 (2007) 502–519. Zbl05560174 MR2409998
  [13] K.P. Burnham and D.R. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer-Verlag, New York, 2nd edition (2002). Zbl1005.62007 MR1919620
  [14] G. Castellan, Modified Akaike's criterion for histogram density estimation. Technical report, Université Paris-Sud 11 (1999).
  [15] G. Castellan, Density estimation via exponential model selection. IEEE Trans. Inf. Theory 49 (2003) 2052–2060. Zbl1288.62054 MR2004713
  [16] G. Celeux and G. Govaert, Gaussian parsimonious clustering models. Pattern Recogn. 28 (1995) 781–793.
  [17] A.P. Dempster, N.M. Laird and D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc., Ser. B 39 (1977) 1–38. Zbl0364.62022 MR501537
  [18] C.R. Genovese and L. Wasserman, Rates of convergence for the Gaussian mixture sieve. Ann. Stat. 28 (2000) 1105–1127. Zbl1105.62333 MR1810921
  [19] S. Ghosal and A.W. van der Vaart, Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Stat. 29 (2001) 1233–1263. Zbl1043.62025 MR1873329
  [20] C. Keribin, Consistent estimation of the order of mixture models. Sankhyā, The Indian Journal of Statistics, Ser. A 62 (2000) 49–66. Zbl1081.62516 MR1769735
  [21] M.H. Law, M.A.T. Figueiredo and A.K. Jain, Simultaneous feature selection and clustering using mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 26 (2004) 1154–1166.
  [22] E. Lebarbier, Detecting multiple change-points in the mean of Gaussian process by model selection. Signal Proc. 85 (2005) 717–736. Zbl1148.94403
  [23] V. Lepez, Potentiel de réserves d'un bassin pétrolier: modélisation et estimation. Ph.D. thesis, Université Paris-Sud 11 (2002).
  [24] P. Massart, Concentration inequalities and model selection. Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003). Zbl1170.60006 MR2319879
  [25] C. Maugis, Sélection de variables pour la classification non supervisée par mélanges gaussiens. Applications à l'étude de données transcriptomes. Ph.D. thesis, Université Paris-Sud 11 (2008).
  [26] C. Maugis, G. Celeux and M.-L. Martin-Magniette, Variable Selection for Clustering with Gaussian Mixture Models. Biometrics (2008) (to appear). Zbl1172.62021 MR2649842
  [27] C. Maugis and B. Michel, Slope heuristics for variable selection and clustering via Gaussian mixtures. Technical Report 6550, INRIA (2008).
  [28] A.E. Raftery and N. Dean, Variable Selection for Model-Based Clustering. J. Am. Stat. Assoc. 101 (2006) 168–178. Zbl1118.62339 MR2268036
  [29] G. Schwarz, Estimating the dimension of a model. Ann. Stat. 6 (1978) 461–464. Zbl0379.62005 MR468014
  [30] D. Serre, Matrices. Springer-Verlag, New York (2002). Zbl1011.15001 MR1923507
  [31] M. Talagrand, Concentration of measure and isoperimetric inequalities in product spaces. Publ. Math., Inst. Hautes Étud. Sci. 81 (1995) 73–205. Zbl0864.60013 MR1361756
  [32] M. Talagrand, New concentration inequalities in product spaces. Invent. Math. 126 (1996) 505–563. Zbl0893.60001 MR1419006
  [33] F. Villers, Tests et sélection de modèles pour l'analyse de données protéomiques et transcriptomiques. Ph.D. thesis, Université Paris-Sud 11 (2007).

Citations in EuDML Documents

  1. Caroline Meynet, An ℓ1-oracle inequality for the Lasso in finite mixture Gaussian regression models
  2. C. Maugis-Rabusseau, B. Michel, Adaptive density estimation for clustering with Gaussian mixtures
  3. Cathy Maugis, Bertrand Michel, Data-driven penalty calibration: A case study for Gaussian mixture model selection
  4. Yannick Baraud, Lucien Birgé, Estimating composite functions by model selection
