A non asymptotic penalized criterion for Gaussian mixture model selection

Cathy Maugis; Bertrand Michel

doi:10.1051/ps/2009004

Published online by Cambridge University Press: 05 January 2012

Cathy Maugis: Institut de Mathématiques de Toulouse, INSA de Toulouse, Université de Toulouse, 135 avenue de Rangueil, 31077 Toulouse Cedex 4, France; cathy.maugis@insa-toulouse.fr
Bertrand Michel: Laboratoire de Statistique Théorique et Appliquée, Université Paris 6, 175 rue du Chevaleret, 75013 Paris, France; bertrand.michel@upmc.fr

Abstract

Specific Gaussian mixtures are considered to solve variable selection and clustering problems simultaneously. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the nonlinearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by Massart [Concentration inequalities and model selection, Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the form of the penalty function. This theorem requires controlling the bracketing entropy of Gaussian mixture families. Both the ordered and non-ordered variable selection cases are addressed in this paper.
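
As a rough illustration of how a penalized likelihood criterion of this kind picks a number of components and a variable subset, the sketch below minimizes the normalized negative log-likelihood plus a penalty over a finite list of fitted candidates. It is only a sketch under stated assumptions: the `Candidate` class, the function names, the linear penalty `kappa * dimension / n`, and the toy numbers are all hypothetical and introduced for illustration. The penalty shape actually derived in the paper is not reproduced here, and the constant `kappa` would have to be calibrated separately.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """One fitted model (hypothetical container, not from the paper)."""
    n_components: int            # number of mixture components K
    variables: Tuple[int, ...]   # indices of the variables kept as relevant
    log_likelihood: float        # maximized log-likelihood on the sample
    dimension: int               # number of free parameters of the model


def penalized_criterion(c: Candidate, n: int, kappa: float) -> float:
    """Normalized negative log-likelihood plus an assumed linear penalty."""
    return -c.log_likelihood / n + kappa * c.dimension / n


def select_model(candidates: Sequence[Candidate], n: int, kappa: float) -> Candidate:
    """Return the candidate that minimizes the penalized criterion."""
    return min(candidates, key=lambda c: penalized_criterion(c, n, kappa))


# Made-up values, for illustration only (not results from the paper).
TOY_CANDIDATES = [
    Candidate(n_components=1, variables=(0,), log_likelihood=-250.0, dimension=2),
    Candidate(n_components=2, variables=(0, 1), log_likelihood=-230.0, dimension=9),
    Candidate(n_components=3, variables=(0, 1), log_likelihood=-228.0, dimension=14),
]

if __name__ == "__main__":
    best = select_model(TOY_CANDIDATES, n=100, kappa=1.0)
    print(best.n_components, best.variables)
```

With `kappa = 0` the richest model always wins, since the log-likelihood can only grow with model size; a larger `kappa` pushes the choice toward smaller models, which is why the size of the penalty matters.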

Keywords

Model-based clustering; variable selection; penalized likelihood criterion; bracketing entropy

Information

Type: Research Article
Information: ESAIM: Probability and Statistics, Volume 15, Supplement: In honor of Marc Yor, 2011, pp. 41–68

DOI: https://doi.org/10.1051/ps/2009004
Copyright: © EDP Sciences, SMAI, 2011

References

H. Akaike, Information theory and an extension of the maximum likelihood principle, in Second International Symposium on Information Theory (Tsahkadsor, 1971), Akadémiai Kiadó, Budapest (1973) 267–281.

S. Arlot and P. Massart, Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. (2008) (to appear).

J.D. Banfield and A.E. Raftery, Model-based Gaussian and non-Gaussian clustering. Biometrics 49 (1993) 803–821.

A. Barron, L. Birgé and P. Massart, Risk bounds for model selection via penalization. Prob. Th. Rel. Fields 113 (1999) 301–413.

J.-P. Baudry, Clustering through model selection criteria. Poster session at One Day Statistical Workshop in Lisieux. http://www.math.u-psud.fr/ baudry, June (2007).

C. Biernacki, G. Celeux and G. Govaert, Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans. Pattern Anal. Mach. Intell. 22 (2000) 719–725.

C. Biernacki, G. Celeux, G. Govaert and F. Langrognet, Model-based cluster and discriminant analysis with the mixmod software. Comput. Stat. Data Anal. 51 (2006) 587–600.

L. Birgé and P. Massart, Gaussian model selection. J. Eur. Math. Soc. 3 (2001) 203–268.

L. Birgé and P. Massart, A generalized C_p criterion for Gaussian model selection. Prépublication n° 647, Universités de Paris 6 et Paris 7 (2001).

L. Birgé and P. Massart, Minimal penalties for Gaussian model selection. Prob. Th. Rel. Fields 138 (2007) 33–73.

L. Birgé and P. Massart, From model selection to adaptive estimation, in Festschrift for Lucien Le Cam. Springer, New York (1997) 55–87.

C. Bouveyron, S. Girard and C. Schmid, High-Dimensional Data Clustering. Comput. Stat. Data Anal. 52 (2007) 502–519.

K.P. Burnham and D.R. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer-Verlag, New York, 2nd edition (2002).

G. Castellan, Modified Akaike's criterion for histogram density estimation. Technical report, Université Paris-Sud 11 (1999).

G. Castellan, Density estimation via exponential model selection. IEEE Trans. Inf. Theory 49 (2003) 2052–2060.

G. Celeux and G. Govaert, Gaussian parsimonious clustering models. Pattern Recogn. 28 (1995) 781–793.

A.P. Dempster, N.M. Laird and D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc., Ser. B 39 (1977) 1–38.

C.R. Genovese and L. Wasserman, Rates of convergence for the Gaussian mixture sieve. Ann. Stat. 28 (2000) 1105–1127.

S. Ghosal and A.W. van der Vaart, Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Stat. 29 (2001) 1233–1263.

C. Keribin, Consistent estimation of the order of mixture models. Sankhyā. The Indian Journal of Statistics. Series A 62 (2000) 49–66.

M.H. Law, M.A.T. Figueiredo and A.K. Jain, Simultaneous feature selection and clustering using mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 26 (2004) 1154–1166.

E. Lebarbier, Detecting multiple change-points in the mean of Gaussian process by model selection. Signal Proc. 85 (2005) 717–736.

V. Lepez, Potentiel de réserves d'un bassin pétrolier: modélisation et estimation. Ph.D. thesis, Université Paris-Sud 11 (2002).

P. Massart, Concentration inequalities and model selection. Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003).

C. Maugis, Sélection de variables pour la classification non supervisée par mélanges gaussiens. Applications à l'étude de données transcriptomes. Ph.D. thesis, University Paris-Sud 11 (2008).

C. Maugis, G. Celeux and M.-L. Martin-Magniette, Variable Selection for Clustering with Gaussian Mixture Models. Biometrics (2008) (to appear).

C. Maugis and B. Michel, Slope heuristics for variable selection and clustering via Gaussian mixtures. Technical Report 6550, INRIA (2008).

A.E. Raftery and N. Dean, Variable Selection for Model-Based Clustering. J. Am. Stat. Assoc. 101 (2006) 168–178.

G. Schwarz, Estimating the dimension of a model. Ann. Stat. 6 (1978) 461–464.

D. Serre, Matrices. Springer-Verlag, New York (2002).

M. Talagrand, Concentration of measure and isoperimetric inequalities in product spaces. Publ. Math., Inst. Hautes Étud. Sci. 81 (1995) 73–205.

M. Talagrand, New concentration inequalities in product spaces. Invent. Math. 126 (1996) 505–563.

F. Villers, Tests et sélection de modèles pour l'analyse de données protéomiques et transcriptomiques. Ph.D. thesis, University Paris-Sud 11 (2007).
