A classification method for binary predictors combining similarity measures and mixture models

Seydou N. Sylla; Stéphane Girard; Abdou Ka Diongue; Aldiouma Diallo; Cheikh Sokhna

Dependence Modeling (2015)

  • Volume: 3, Issue: 1
  • ISSN: 2300-2298

Abstract

In this paper, a new supervised classification method dedicated to binary predictors is proposed. Its originality is to combine a model-based classification rule with similarity measures thanks to the introduction of a new family of exponential kernels. Some links are established between existing similarity measures when applied to binary predictors. A new family of measures is also introduced to unify some of the existing literature. The performance of the new classification method is illustrated on two real datasets (verbal autopsy data and handwritten digit data) using 76 similarity measures.

How to cite


Seydou N. Sylla, et al. "A classification method for binary predictors combining similarity measures and mixture models." Dependence Modeling 3.1 (2015). <http://eudml.org/doc/275918>.

@article{SeydouN2015,
abstract = {In this paper, a new supervised classification method dedicated to binary predictors is proposed. Its originality is to combine a model-based classification rule with similarity measures thanks to the introduction of a new family of exponential kernels. Some links are established between existing similarity measures when applied to binary predictors. A new family of measures is also introduced to unify some of the existing literature. The performance of the new classification method is illustrated on two real datasets (verbal autopsy data and handwritten digit data) using 76 similarity measures.},
author = {Seydou N. Sylla and Stéphane Girard and Abdou Ka Diongue and Aldiouma Diallo and Cheikh Sokhna},
journal = {Dependence Modeling},
keywords = {Mixture model; binary predictors; kernel method; similarity measure},
language = {eng},
number = {1},
title = {A classification method for binary predictors combining similarity measures and mixture models},
url = {http://eudml.org/doc/275918},
volume = {3},
year = {2015},
}

TY - JOUR
AU - Seydou N. Sylla
AU - Stéphane Girard
AU - Abdou Ka Diongue
AU - Aldiouma Diallo
AU - Cheikh Sokhna
TI - A classification method for binary predictors combining similarity measures and mixture models
JO - Dependence Modeling
PY - 2015
VL - 3
IS - 1
AB - In this paper, a new supervised classification method dedicated to binary predictors is proposed. Its originality is to combine a model-based classification rule with similarity measures thanks to the introduction of a new family of exponential kernels. Some links are established between existing similarity measures when applied to binary predictors. A new family of measures is also introduced to unify some of the existing literature. The performance of the new classification method is illustrated on two real datasets (verbal autopsy data and handwritten digit data) using 76 similarity measures.
LA - eng
KW - Mixture model; binary predictors; kernel method; similarity measure
UR - http://eudml.org/doc/275918
ER -

