A non asymptotic penalized criterion for Gaussian mixture model selection

Cathy Maugis; Bertrand Michel

ESAIM: Probability and Statistics (2012)

Volume: 15, page 41-68
ISSN: 1292-8100

Abstract

Specific Gaussian mixtures are considered in order to solve variable selection and clustering problems simultaneously. A non-asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because the associated Kullback-Leibler contrast is nonlinear on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by Massart [Concentration inequalities and model selection, Springer, Berlin (2007); lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23, 2003] is used to obtain the form of the penalty function. Applying this theorem requires controlling the bracketing entropy of Gaussian mixture families. Both the ordered and non-ordered variable selection cases are addressed in this paper.
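
As an illustration of how a penalized criterion of this kind is typically used, the minimal Python sketch below picks the pair (number of components, variable subset) that minimizes the normalized negative log-likelihood plus a penalty. It is a sketch under assumptions, not the authors' procedure: the penalty shape kappa * D / n, the constant kappa, the function names, and all numbers in the usage example are hypothetical, and the penalty derived in the paper from bracketing entropy bounds may contain additional terms. The maximized log-likelihoods and model dimensions are assumed to have been computed beforehand on the full data vector, so that they are comparable across variable subsets.

```python
def penalized_criterion(log_likelihood, dimension, n_samples, kappa=1.0):
    """Normalized negative log-likelihood plus a penalty proportional to D / n.

    The linear penalty shape and the constant kappa are assumptions made
    for illustration only.
    """
    return -log_likelihood / n_samples + kappa * dimension / n_samples


def select_model(candidates, n_samples, kappa=1.0):
    """Return the model label with the smallest penalized criterion.

    candidates maps a label (K, variable_subset) to a pair
    (maximized log-likelihood, model dimension D).
    """
    def score(item):
        _, (log_likelihood, dimension) = item
        return penalized_criterion(log_likelihood, dimension, n_samples, kappa)

    best_label, _ = min(candidates.items(), key=score)
    return best_label


# Hypothetical fitted models: (K, selected variables) -> (log-likelihood, D).
candidates = {
    (2, (0, 1)): (-520.0, 11),
    (3, (0, 1)): (-505.0, 17),
    (3, (0, 1, 2)): (-500.0, 29),
}

print(select_model(candidates, n_samples=200, kappa=1.0))
print(select_model(candidates, n_samples=200, kappa=3.0))
```

In this made-up example a larger kappa favors the smaller model, which is why the constant has to be calibrated in practice; the slope heuristics report by Maugis and Michel listed in the references addresses that calibration.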

How to cite

MLA
BibTeX
RIS

Maugis, Cathy, and Bertrand Michel. "A non asymptotic penalized criterion for Gaussian mixture model selection." ESAIM: Probability and Statistics 15 (2012): 41-68. <http://eudml.org/doc/222454>.

@article{Maugis2012,
abstract = { Specific Gaussian mixtures are considered to solve simultaneously variable selection and clustering problems. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the non linearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by [Massart Concentration inequalities and model selection Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the penalty function form. This theorem requires to control the bracketing entropy of Gaussian mixture families. The ordered and non-ordered variable selection cases are both addressed in this paper. },
author = {Maugis, Cathy and Michel, Bertrand},
journal = {ESAIM: Probability and Statistics},
keywords = {model-based clustering; variable selection; penalized likelihood criterion; bracketing entropy},
language = {eng},
month = {1},
pages = {41-68},
publisher = {EDP Sciences},
title = {A non asymptotic penalized criterion for Gaussian mixture model selection},
url = {http://eudml.org/doc/222454},
volume = {15},
year = {2012},
}

TY - JOUR
AU - Maugis, Cathy
AU - Michel, Bertrand
TI - A non asymptotic penalized criterion for Gaussian mixture model selection
JO - ESAIM: Probability and Statistics
DA - 2012/1//
PB - EDP Sciences
VL - 15
SP - 41
EP - 68
AB - Specific Gaussian mixtures are considered to solve simultaneously variable selection and clustering problems. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the non linearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by [Massart Concentration inequalities and model selection Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the penalty function form. This theorem requires to control the bracketing entropy of Gaussian mixture families. The ordered and non-ordered variable selection cases are both addressed in this paper.
LA - eng
KW - model-based clustering; variable selection; penalized likelihood criterion; bracketing entropy
UR - http://eudml.org/doc/222454
ER -

References

H. Akaike, Information theory and an extension of the maximum likelihood principle, in Second International Symposium on Information Theory (Tsahkadsor, 1971), Akadémiai Kiadó, Budapest (1973) 267–281.
S. Arlot and P. Massart, Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. (2008) (to appear).
J.D. Banfield and A.E. Raftery, Model-based Gaussian and non-Gaussian clustering. Biometrics 49 (1993) 803–821.
A. Barron, L. Birgé and P. Massart, Risk bounds for model selection via penalization. Prob. Th. Rel. Fields 113 (1999) 301–413.
J.-P. Baudry, Clustering through model selection criteria. Poster session at One Day Statistical Workshop in Lisieux, June (2007). URL: http://www.math.u-psud.fr/
C. Biernacki, G. Celeux and G. Govaert, Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans. Pattern Anal. Mach. Intell. 22 (2000) 719–725.
C. Biernacki, G. Celeux, G. Govaert and F. Langrognet, Model-based cluster and discriminant analysis with the mixmod software. Comput. Stat. Data Anal. 51 (2006) 587–600.
L. Birgé and P. Massart, Gaussian model selection. J. Eur. Math. Soc. 3 (2001) 203–268.
L. Birgé and P. Massart, A generalized Cp criterion for Gaussian model selection. Prépublication n° 647, Universités de Paris 6 et Paris 7 (2001).
L. Birgé and P. Massart, Minimal penalties for Gaussian model selection. Prob. Th. Rel. Fields 138 (2007) 33–73.
L. Birgé and P. Massart, From model selection to adaptive estimation, in Festschrift for Lucien Le Cam. Springer, New York (1997) 55–87.
C. Bouveyron, S. Girard and C. Schmid, High-Dimensional Data Clustering. Comput. Stat. Data Anal. 52 (2007) 502–519.
K.P. Burnham and D.R. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer-Verlag, New York, 2nd edition (2002).
G. Castellan, Modified Akaike's criterion for histogram density estimation. Technical report, Université Paris-Sud 11 (1999).
G. Castellan, Density estimation via exponential model selection. IEEE Trans. Inf. Theory 49 (2003) 2052–2060.
G. Celeux and G. Govaert, Gaussian parsimonious clustering models. Pattern Recogn. 28 (1995) 781–793.
A.P. Dempster, N.M. Laird and D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc., Ser. B 39 (1977) 1–38.
C.R. Genovese and L. Wasserman, Rates of convergence for the Gaussian mixture sieve. Ann. Stat. 28 (2000) 1105–1127.
S. Ghosal and A.W. van der Vaart, Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Stat. 29 (2001) 1233–1263.
C. Keribin, Consistent estimation of the order of mixture models. Sankhyā. The Indian Journal of Statistics. Series A 62 (2000) 49–66.
M.H. Law, M.A.T. Figueiredo and A.K. Jain, Simultaneous feature selection and clustering using mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 26 (2004) 1154–1166.
E. Lebarbier, Detecting multiple change-points in the mean of Gaussian process by model selection. Signal Proc. 85 (2005) 717–736.
V. Lepez, Potentiel de réserves d'un bassin pétrolier: modélisation et estimation. Ph.D. thesis, Université Paris-Sud 11 (2002).
P. Massart, Concentration inequalities and model selection. Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003).
C. Maugis, Sélection de variables pour la classification non supervisée par mélanges gaussiens. Applications à l'étude de données transcriptomes. Ph.D. thesis, University Paris-Sud 11 (2008).
C. Maugis, G. Celeux and M.-L. Martin-Magniette, Variable Selection for Clustering with Gaussian Mixture Models. Biometrics (2008) (to appear).
C. Maugis and B. Michel, Slope heuristics for variable selection and clustering via Gaussian mixtures. Technical Report 6550, INRIA (2008).
A.E. Raftery and N. Dean, Variable Selection for Model-Based Clustering. J. Am. Stat. Assoc. 101 (2006) 168–178.
G. Schwarz, Estimating the dimension of a model. Ann. Stat. 6 (1978) 461–464.
D. Serre, Matrices. Springer-Verlag, New York (2002).
M. Talagrand, Concentration of measure and isoperimetric inequalities in product spaces. Publ. Math., Inst. Hautes Étud. Sci. 81 (1995) 73–205.
M. Talagrand, New concentration inequalities in product spaces. Invent. Math. 126 (1996) 505–563.
F. Villers, Tests et sélection de modèles pour l'analyse de données protéomiques et transcriptomiques. Ph.D. thesis, University Paris-Sud 11 (2007).

Citations in EuDML Documents

S. X. Cohen, E. Le Pennec, Partition-based conditional density estimation
