# A non asymptotic penalized criterion for gaussian mixture model selection

ESAIM: Probability and Statistics (2011)

- Volume: 15, page 41-68
- ISSN: 1292-8100

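## Abstract

Specific Gaussian mixtures are considered to solve simultaneously variable selection and clustering problems. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the non linearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by [Massart Concentration inequalities and model selection Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the penalty function form. This theorem requires to control the bracketing entropy of Gaussian mixture families. The ordered and non-ordered variable selection cases are both addressed in this paper.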
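
As a rough sketch of what the abstract of this record means by a "penalized criterion" for maximum likelihood estimation, such a criterion generically takes the shape below. The notation is assumed for illustration and is not taken from this record: $(K, v)$ indexes a number of mixture components and a variable subset, $\mathcal{S}_{(K,v)}$ is the corresponding family of Gaussian mixture densities, $x_1, \dots, x_n$ are the observations, and $\mathrm{pen}$ is a penalty function left unspecified here. The penalty actually derived in the paper is not reproduced.

```latex
% Maximum likelihood estimator within each candidate model (K, v)
\hat{s}_{(K,v)} \in \operatorname*{arg\,min}_{t \in \mathcal{S}_{(K,v)}}
  \left\{ -\frac{1}{n} \sum_{i=1}^{n} \ln t(x_i) \right\}

% Penalized criterion: empirical contrast plus a penalty on the model
\mathrm{crit}(K, v) = -\frac{1}{n} \sum_{i=1}^{n} \ln \hat{s}_{(K,v)}(x_i)
  + \mathrm{pen}(K, v)

% Selected number of components and variable subset
(\hat{K}, \hat{v}) \in \operatorname*{arg\,min}_{(K, v)} \mathrm{crit}(K, v)
```

Under this reading, choosing the number of components and the variable subset amounts to minimizing one criterion over all candidate pairs $(K, v)$.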

## How to cite

Maugis, Cathy, and Michel, Bertrand. "A non asymptotic penalized criterion for gaussian mixture model selection." ESAIM: Probability and Statistics 15 (2011): 41-68. <http://eudml.org/doc/277155>.

@article{Maugis2011,

abstract = {Specific Gaussian mixtures are considered to solve simultaneously variable selection and clustering problems. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the non linearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by [Massart Concentration inequalities and model selection Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the penalty function form. This theorem requires to control the bracketing entropy of Gaussian mixture families. The ordered and non-ordered variable selection cases are both addressed in this paper.},

author = {Maugis, Cathy and Michel, Bertrand},

journal = {ESAIM: Probability and Statistics},

keywords = {model-based clustering; variable selection; penalized likelihood criterion; bracketing entropy},

language = {eng},

pages = {41-68},

publisher = {EDP-Sciences},

title = {A non asymptotic penalized criterion for gaussian mixture model selection},

url = {http://eudml.org/doc/277155},

volume = {15},

year = {2011},

}

TY - JOUR

AU - Maugis, Cathy

AU - Michel, Bertrand

TI - A non asymptotic penalized criterion for gaussian mixture model selection

JO - ESAIM: Probability and Statistics

PY - 2011

PB - EDP-Sciences

VL - 15

SP - 41

EP - 68

AB - Specific Gaussian mixtures are considered to solve simultaneously variable selection and clustering problems. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the non linearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by [Massart Concentration inequalities and model selection Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the penalty function form. This theorem requires to control the bracketing entropy of Gaussian mixture families. The ordered and non-ordered variable selection cases are both addressed in this paper.

LA - eng

KW - model-based clustering; variable selection; penalized likelihood criterion; bracketing entropy

UR - http://eudml.org/doc/277155

ER -

## References

- [1] H. Akaike, Information theory and an extension of the maximum likelihood principle, in Second International Symposium on Information Theory (Tsahkadsor, 1971), Akadémiai Kiadó, Budapest (1973) 267–281. Zbl 0283.62006, MR 483125
- [2] S. Arlot and P. Massart, Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. (2008) (to appear).
- [3] J.D. Banfield and A.E. Raftery, Model-based Gaussian and non-Gaussian clustering. Biometrics 49 (1993) 803–821. Zbl 0794.62034, MR 1243494
- [4] A. Barron, L. Birgé and P. Massart, Risk bounds for model selection via penalization. Prob. Th. Rel. Fields 113 (1999) 301–413. Zbl 0946.62036, MR 1679028
- [5] J.-P. Baudry, Clustering through model selection criteria. Poster session at One Day Statistical Workshop in Lisieux. http://www.math.u-psud.fr/ baudry, June (2007).
- [6] C. Biernacki, G. Celeux and G. Govaert, Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans. Pattern Anal. Mach. Intell. 22 (2000) 719–725.
- [7] C. Biernacki, G. Celeux, G. Govaert and F. Langrognet, Model-based cluster and discriminant analysis with the mixmod software. Comput. Stat. Data Anal. 51 (2006) 587–600. Zbl 1157.62431, MR 2297473
- [8] L. Birgé and P. Massart, Gaussian model selection. J. Eur. Math. Soc. 3 (2001) 203–268. Zbl 1037.62001, MR 1848946
- [9] L. Birgé and P. Massart, A generalized Cp criterion for Gaussian model selection. Prépublication n° 647, Universités de Paris 6 et Paris 7 (2001). Zbl 1037.62001
- [10] L. Birgé and P. Massart, Minimal penalties for Gaussian model selection. Prob. Th. Rel. Fields 138 (2007) 33–73. Zbl 1112.62082, MR 2288064
- [11] L. Birgé and P. Massart, From model selection to adaptive estimation, in Festschrift for Lucien Le Cam. Springer, New York (1997) 55–87. Zbl 0920.62042, MR 1462939
- [12] C. Bouveyron, S. Girard and C. Schmid, High-Dimensional Data Clustering. Comput. Stat. Data Anal. 52 (2007) 502–519. Zbl 05560174, MR 2409998
- [13] K.P. Burnham and D.R. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer-Verlag, New York, 2nd edition (2002). Zbl 1005.62007, MR 1919620
- [14] G. Castellan, Modified Akaike's criterion for histogram density estimation. Technical report, Université Paris-Sud 11 (1999).
- [15] G. Castellan, Density estimation via exponential model selection. IEEE Trans. Inf. Theory 49 (2003) 2052–2060. Zbl 1288.62054, MR 2004713
- [16] G. Celeux and G. Govaert, Gaussian parsimonious clustering models. Pattern Recogn. 28 (1995) 781–793.
- [17] A.P. Dempster, N.M. Laird and D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc., Ser. B 39 (1977) 1–38. Zbl 0364.62022, MR 501537
- [18] C.R. Genovese and L. Wasserman, Rates of convergence for the Gaussian mixture sieve. Ann. Stat. 28 (2000) 1105–1127. Zbl 1105.62333, MR 1810921
- [19] S. Ghosal and A.W. van der Vaart, Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Stat. 29 (2001) 1233–1263. Zbl 1043.62025, MR 1873329
- [20] C. Keribin, Consistent estimation of the order of mixture models. Sankhyā. The Indian Journal of Statistics. Series A 62 (2000) 49–66. Zbl 1081.62516, MR 1769735
- [21] M.H. Law, M.A.T. Figueiredo and A.K. Jain, Simultaneous feature selection and clustering using mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 26 (2004) 1154–1166.
- [22] E. Lebarbier, Detecting multiple change-points in the mean of Gaussian process by model selection. Signal Proc. 85 (2005) 717–736. Zbl 1148.94403
- [23] V. Lepez, Potentiel de réserves d'un bassin pétrolier: modélisation et estimation. Ph.D. thesis, Université Paris-Sud 11 (2002).
- [24] P. Massart, Concentration inequalities and model selection. Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003). Zbl 1170.60006, MR 2319879
- [25] C. Maugis, Sélection de variables pour la classification non supervisée par mélanges gaussiens. Applications à l'étude de données transcriptomes. Ph.D. thesis, University Paris-Sud 11 (2008).
- [26] C. Maugis, G. Celeux and M.-L. Martin-Magniette, Variable Selection for Clustering with Gaussian Mixture Models. Biometrics (2008) (to appear). Zbl 1172.62021, MR 2649842
- [27] C. Maugis and B. Michel, Slope heuristics for variable selection and clustering via Gaussian mixtures. Technical Report 6550, INRIA (2008).
- [28] A.E. Raftery and N. Dean, Variable Selection for Model-Based Clustering. J. Am. Stat. Assoc. 101 (2006) 168–178. Zbl 1118.62339, MR 2268036
- [29] G. Schwarz, Estimating the dimension of a model. Ann. Stat. 6 (1978) 461–464. Zbl 0379.62005, MR 468014
- [30] D. Serre, Matrices. Springer-Verlag, New York (2002). Zbl 1011.15001, MR 1923507
- [31] M. Talagrand, Concentration of measure and isoperimetric inequalities in product spaces. Publ. Math., Inst. Hautes Étud. Sci. 81 (1995) 73–205. Zbl 0864.60013, MR 1361756
- [32] M. Talagrand, New concentration inequalities in product spaces. Invent. Math. 126 (1996) 505–563. Zbl 0893.60001, MR 1419006
- [33] F. Villers, Tests et sélection de modèles pour l'analyse de données protéomiques et transcriptomiques. Ph.D. thesis, University Paris-Sud 11 (2007).

## Citations in EuDML Documents

- Caroline Meynet, An ℓ1-oracle inequality for the Lasso in finite mixture gaussian regression models
- C. Maugis-Rabusseau, B. Michel, Adaptive density estimation for clustering with gaussian mixtures
- Cathy Maugis, Bertrand Michel, Data-driven penalty calibration: A case study for gaussian mixture model selection
- Cathy Maugis, Bertrand Michel, Data-driven penalty calibration: A case study for Gaussian mixture model selection
- Yannick Baraud, Lucien Birgé, Estimating composite functions by model selection