A non asymptotic penalized criterion for Gaussian mixture model selection

Cathy Maugis; Bertrand Michel

ESAIM: Probability and Statistics (2012)

  • Volume: 15, page 41-68
  • ISSN: 1292-8100

Abstract

Specific Gaussian mixtures are considered in order to solve variable selection and clustering problems simultaneously. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the non linearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by Massart [Concentration inequalities and model selection, Springer, Berlin (2007); lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23, 2003] is used to obtain the form of the penalty function. Applying this theorem requires controlling the bracketing entropy of Gaussian mixture families. Both the ordered and non-ordered variable selection cases are addressed in this paper.
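
As a rough illustration of what a penalized criterion of this kind looks like, the generic form of penalized maximum likelihood selection can be written as below. This is only a sketch in standard notation, not the paper's exact statement: the index (K, v) for the number of components and the variable subset, the model S_{(K,v)}, the estimator \hat{s}_{(K,v)}, the observations X_1, ..., X_n and the function pen are symbols assumed here for illustration.

```latex
% Generic penalized maximum likelihood selection (illustrative notation,
% assumed here and not taken from the paper).
% \hat{s}_{(K,v)}   : maximum likelihood estimator in the model S_{(K,v)}
% \mathrm{pen}(K,v) : penalty, typically growing with the model dimension
% Requires amsmath for \operatorname*.
\[
  \gamma_n(t) = -\frac{1}{n}\sum_{i=1}^{n} \ln t(X_i),
  \qquad
  (\hat{K}, \hat{v}) \in \operatorname*{arg\,min}_{(K,v)}
  \Bigl\{ \gamma_n\bigl(\hat{s}_{(K,v)}\bigr) + \mathrm{pen}(K,v) \Bigr\}.
\]
```

Here \gamma_n is the empirical log-likelihood contrast, so the selected pair balances goodness of fit against the penalty. The abstract indicates that the form of the penalty is derived from bracketing entropy bounds; its precise expression is not reproduced here.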

How to cite

Maugis, Cathy, and Michel, Bertrand. "A non asymptotic penalized criterion for Gaussian mixture model selection." ESAIM: Probability and Statistics 15 (2012): 41-68. <http://eudml.org/doc/222454>.

@article{Maugis2012,
abstract = { Specific Gaussian mixtures are considered to solve simultaneously variable selection and clustering problems. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the non linearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by [Massart Concentration inequalities and model selection Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the penalty function form. This theorem requires to control the bracketing entropy of Gaussian mixture families. The ordered and non-ordered variable selection cases are both addressed in this paper. },
author = {Maugis, Cathy and Michel, Bertrand},
journal = {ESAIM: Probability and Statistics},
keywords = {Model-based clustering; variable selection; penalized likelihood criterion; bracketing entropy},
language = {eng},
month = {1},
pages = {41-68},
publisher = {EDP Sciences},
title = {A non asymptotic penalized criterion for Gaussian mixture model selection},
url = {http://eudml.org/doc/222454},
volume = {15},
year = {2012},
}

TY - JOUR
AU - Maugis, Cathy
AU - Michel, Bertrand
TI - A non asymptotic penalized criterion for Gaussian mixture model selection
JO - ESAIM: Probability and Statistics
DA - 2012/1//
PB - EDP Sciences
VL - 15
SP - 41
EP - 68
AB - Specific Gaussian mixtures are considered to solve simultaneously variable selection and clustering problems. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the non linearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by [Massart Concentration inequalities and model selection Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the penalty function form. This theorem requires to control the bracketing entropy of Gaussian mixture families. The ordered and non-ordered variable selection cases are both addressed in this paper.
LA - eng
KW - Model-based clustering; variable selection; penalized likelihood criterion; bracketing entropy
UR - http://eudml.org/doc/222454
ER -

References

  1. H. Akaike, Information theory and an extension of the maximum likelihood principle, in Second International Symposium on Information Theory (Tsahkadsor, 1971), Akadémiai Kiadó, Budapest (1973) 267–281.
  2. S. Arlot and P. Massart, Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. (2008) (to appear).
  3. J.D. Banfield and A.E. Raftery, Model-based Gaussian and non-Gaussian clustering. Biometrics 49 (1993) 803–821. Zbl0794.62034
  4. A. Barron, L. Birgé and P. Massart, Risk bounds for model selection via penalization. Prob. Th. Rel. Fields 113 (1999) 301–413. Zbl0946.62036
  5. J.-P. Baudry, Clustering through model selection criteria. Poster session at One Day Statistical Workshop in Lisieux. baudry, June (2007). URI: http://www.math.u-psud.fr/
  6. C. Biernacki, G. Celeux and G. Govaert, Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans. Pattern Anal. Mach. Intell. 22 (2000) 719–725.
  7. C. Biernacki, G. Celeux, G. Govaert and F. Langrognet, Model-based cluster and discriminant analysis with the mixmod software. Comput. Stat. Data Anal. 51 (2006) 587–600. Zbl1157.62431
  8. L. Birgé and P. Massart, Gaussian model selection. J. Eur. Math. Soc. 3 (2001) 203–268. Zbl1037.62001
  9. L. Birgé and P. Massart, A generalized Cp criterion for Gaussian model selection. Prépublication n° 647, Universités de Paris 6 et Paris 7 (2001). Zbl1037.62001
  10. L. Birgé and P. Massart, Minimal penalties for Gaussian model selection. Prob. Th. Rel. Fields 138 (2007) 33–73. Zbl1112.62082
  11. L. Birgé and P. Massart, From model selection to adaptive estimation, in Festschrift for Lucien Le Cam. Springer, New York (1997) 55–87. Zbl0920.62042
  12. C. Bouveyron, S. Girard and C. Schmid, High-Dimensional Data Clustering. Comput. Stat. Data Anal. 52 (2007) 502–519. Zbl05560174
  13. K.P. Burnham and D.R. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer-Verlag, New York, 2nd edition (2002). Zbl1005.62007
  14. G. Castellan, Modified Akaike's criterion for histogram density estimation. Technical report, Université Paris-Sud 11 (1999).
  15. G. Castellan, Density estimation via exponential model selection. IEEE Trans. Inf. Theory 49 (2003) 2052–2060. Zbl1288.62054
  16. G. Celeux and G. Govaert, Gaussian parsimonious clustering models. Pattern Recogn. 28 (1995) 781–793.
  17. A.P. Dempster, N.M. Laird and D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc., Ser. B 39 (1977) 1–38. Zbl0364.62022
  18. C.R. Genovese and L. Wasserman, Rates of convergence for the Gaussian mixture sieve. Ann. Stat. 28 (2000) 1105–1127. Zbl1105.62333
  19. S. Ghosal and A.W. van der Vaart, Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Stat. 29 (2001) 1233–1263. Zbl1043.62025
  20. C. Keribin, Consistent estimation of the order of mixture models. Sankhyā, The Indian Journal of Statistics, Series A 62 (2000) 49–66. Zbl1081.62516
  21. M.H. Law, M.A.T. Figueiredo and A.K. Jain, Simultaneous feature selection and clustering using mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 26 (2004) 1154–1166.
  22. E. Lebarbier, Detecting multiple change-points in the mean of Gaussian process by model selection. Signal Proc. 85 (2005) 717–736. Zbl1148.94403
  23. V. Lepez, Potentiel de réserves d'un bassin pétrolier: modélisation et estimation. Ph.D. thesis, Université Paris-Sud 11 (2002).
  24. P. Massart, Concentration inequalities and model selection. Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003).
  25. C. Maugis, Sélection de variables pour la classification non supervisée par mélanges gaussiens. Applications à l'étude de données transcriptomes. Ph.D. thesis, University Paris-Sud 11 (2008).
  26. C. Maugis, G. Celeux and M.-L. Martin-Magniette, Variable Selection for Clustering with Gaussian Mixture Models. Biometrics (2008) (to appear). Zbl1172.62021
  27. C. Maugis and B. Michel, Slope heuristics for variable selection and clustering via Gaussian mixtures. Technical Report 6550, INRIA (2008).
  28. A.E. Raftery and N. Dean, Variable Selection for Model-Based Clustering. J. Am. Stat. Assoc. 101 (2006) 168–178. Zbl1118.62339
  29. G. Schwarz, Estimating the dimension of a model. Ann. Stat. 6 (1978) 461–464. Zbl0379.62005
  30. D. Serre, Matrices. Springer-Verlag, New York (2002).
  31. M. Talagrand, Concentration of measure and isoperimetric inequalities in product spaces. Publ. Math., Inst. Hautes Étud. Sci. 81 (1995) 73–205. Zbl0864.60013
  32. M. Talagrand, New concentration inequalities in product spaces. Invent. Math. 126 (1996) 505–563. Zbl0893.60001
  33. F. Villers, Tests et sélection de modèles pour l'analyse de données protéomiques et transcriptomiques. Ph.D. thesis, University Paris-Sud 11 (2007).
