Consensus clustering with differential evolution

Miroslav Sabo

Kybernetika (2014)

  • Volume: 50, Issue: 5, pages 661-678
  • ISSN: 0023-5954

Abstract

Consensus clustering algorithms are used to improve the properties of traditional clustering methods, especially their accuracy and robustness. In this article, we introduce an approach that refines the set of initial partitions and uses a differential evolution algorithm to find the most valid solution. The properties of the algorithm are demonstrated on four benchmark datasets.

How to cite

Sabo, Miroslav. "Consensus clustering with differential evolution." Kybernetika 50.5 (2014): 661-678. <http://eudml.org/doc/262188>.

@article{Sabo2014,
abstract = {Consensus clustering algorithms are used to improve properties of traditional clustering methods, especially their accuracy and robustness. In this article, we introduce our approach that is based on a refinement of the set of initial partitions and uses differential evolution algorithm in order to find the most valid solution. Properties of the algorithm are demonstrated on four benchmark datasets.},
author = {Sabo, Miroslav},
journal = {Kybernetika},
keywords = {consensus clustering; differential evolution; ensemble; data},
language = {eng},
number = {5},
pages = {661-678},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Consensus clustering with differential evolution},
url = {http://eudml.org/doc/262188},
volume = {50},
year = {2014},
}

TY - JOUR
AU - Sabo, Miroslav
TI - Consensus clustering with differential evolution
JO - Kybernetika
PY - 2014
PB - Institute of Information Theory and Automation AS CR
VL - 50
IS - 5
SP - 661
EP - 678
AB - Consensus clustering algorithms are used to improve properties of traditional clustering methods, especially their accuracy and robustness. In this article, we introduce our approach that is based on a refinement of the set of initial partitions and uses differential evolution algorithm in order to find the most valid solution. Properties of the algorithm are demonstrated on four benchmark datasets.
LA - eng
KW - consensus clustering; differential evolution; ensemble; data
UR - http://eudml.org/doc/262188
ER -

